Google is rolling out new AI models for health care. Here’s how doctors are using them

Google on Wednesday announced MedLM, a suite of new health-care-specific artificial intelligence models designed to help clinicians and researchers carry out complex studies, summarize doctor-patient interactions and more.

The move marks Google’s latest attempt to monetize AI tools for the health-care industry, as competition for market share remains fierce among rivals like Amazon and Microsoft. CNBC spoke with companies that have been testing Google’s technology, like HCA Healthcare, and experts say the potential for impact is real, though they are taking steps to implement it carefully.

The MedLM suite includes a large and a medium-sized AI model, both built on Med-PaLM 2, a large language model trained on medical data that Google first announced in March. MedLM is generally available to eligible Google Cloud customers in the U.S. starting Wednesday. Google said that while the cost of the AI suite varies depending on how companies use the different models, the medium-sized model is less expensive to run.

Google said it also plans to introduce health-care-specific versions of Gemini, the company’s newest and “most capable” AI model, to MedLM in the future.

Aashima Gupta, Google Cloud’s global director of health-care strategy and solutions, said the company found that different medically tuned AI models can carry out certain tasks better than others. That’s why Google decided to introduce a suite of models instead of trying to build a “one-size-fits-all” solution. 

For instance, Google said its larger MedLM model is better for carrying out complicated tasks that require deep knowledge and lots of compute power, such as conducting a study using data from a health-care organization’s entire patient population. But if companies need a more agile model that can be optimized for specific or real-time functions, such as summarizing an interaction between a doctor and patient, the medium-sized model should work better, according to Gupta.

Real-world use cases

When Google announced Med-PaLM 2 in March, the company initially said it could be used to answer questions like “What are the first warning signs of pneumonia?” and “Can incontinence be cured?” But as the company has tested the technology with customers, the use cases have changed, according to Greg Corrado, head of Google’s health AI. 

Corrado said clinicians don’t often need help with “accessible” questions about the nature of a disease, so Google hasn’t seen much demand for those capabilities from customers. Instead, health organizations often want AI to help solve more back-office or logistical problems, like managing paperwork.  

“They want something that’s helping them with the real pain points and slowdowns that are in their workflow, that only they know,” Corrado told CNBC. 

For instance, HCA Healthcare, one of the largest health systems in the U.S., has been testing Google’s AI technology since the spring. The company announced an official collaboration with Google Cloud in August that aims to use its generative AI to “improve workflows on time-consuming tasks.” 

Dr. Michael Schlosser, senior vice president of care transformation and innovation at HCA, said the company has been using MedLM to help emergency medicine physicians automatically document their interactions with patients. For instance, HCA uses an ambient speech documentation system from a company called Augmedix to transcribe doctor-patient meetings. Google’s MedLM suite can then take those transcripts and break them up into the components of an ER provider note.

Schlosser said HCA has been using MedLM within emergency rooms at four hospitals, and the company wants to expand use over the next year. By January, Schlosser added, he expects Google’s technology will be able to successfully generate more than half of a note without help from providers. For doctors who can spend up to four hours a day on clerical paperwork, Schlosser said saving that time and effort makes a meaningful difference. 

“That’s been a huge leap forward for us,” Schlosser told CNBC. “We now think we’re going to be at a point where the AI, by itself, can create 60-plus percent of the note correctly on its own before we have the human doing the review and the editing.” 

Schlosser said HCA is also working to use MedLM to develop a handoff tool for nurses. The tool can read through the electronic health record and identify relevant information for nurses to pass along to the next shift. 

Handoffs are “laborious” and a real pain point for nurses, so it would be “powerful” to automate the process, Schlosser said. Nurses across HCA’s hospitals carry out around 400,000 handoffs a week, and two HCA hospitals have been testing the nurse handoff tool. Schlosser said nurses conduct a side-by-side comparison of a traditional handoff and an AI-generated handoff and provide feedback.

With both use cases, though, HCA has found that MedLM is not foolproof.

Schlosser said the fact that AI models can spit out incorrect information is a big challenge, and HCA has been working with Google to come up with best practices to minimize those fabrications. He added that token limits, which restrict the amount of data that can be fed to the model, and managing the AI over time have been additional challenges for HCA. 
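
To illustrate what a token limit means in practice, here is a minimal sketch of one common workaround: splitting a long transcript into pieces that each fit under a budget. It is not HCA's or Google's method. The function name `chunk_by_token_budget` and the budget of 4 are hypothetical, and whitespace-separated words stand in for tokens, which real model tokenizers count differently.

```python
def chunk_by_token_budget(text: str, max_tokens: int) -> list[str]:
    """Split text into chunks of at most max_tokens words.

    Assumption: one whitespace-separated word counts as one token.
    Real tokenizers split text differently, so this is only a proxy.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    # Step through the word list in strides of max_tokens.
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


# Hypothetical transcript snippet and budget, for illustration only.
transcript = "Patient reports chest pain since this morning and shortness of breath"
chunks = chunk_by_token_budget(transcript, max_tokens=4)
for chunk in chunks:
    print(chunk)
```

Each chunk could then be sent to a model separately. The trade-off is that the model sees only part of the conversation at a time.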

“What I would say right now, is that the hype around the current use of these AI models in health care is outstripping the reality,” Schlosser said. “Everyone’s contending with this problem, and no one has really let these models loose in a scaled way in the health-care systems because of that.”

Even so, Schlosser said providers’ initial response to MedLM has been positive, and they recognize that they are not working with the finished product yet. He said HCA is working hard to implement the technology in a responsible way to avoid putting patients at risk.

“We’re being very cautious with how we approach these AI models,” he said. “We’re not using those use cases where the model outputs can somehow affect someone’s diagnosis and treatment.”

Google also plans to introduce health-care-specific versions of Gemini to MedLM in the future. Its shares popped 5% after Gemini’s launch earlier this month, but Google faced scrutiny over its demonstration video, which was not conducted in real time, the company confirmed to Bloomberg.

In a statement, Google told CNBC: “The video is an illustrative depiction of the possibilities of interacting with Gemini, based on real multimodal prompts and outputs from testing. We look forward to seeing what people create when access to Gemini Pro opens on December 13.”

Corrado and Gupta of Google said Gemini is still in early stages, and it needs to be tested and evaluated with customers in controlled health-care settings before the model rolls out through MedLM more broadly. 

“We’ve been testing Med-PaLM 2 with our customers for months, and now we’re comfortable taking that as part of MedLM,” Gupta said. “Gemini will follow the same thing.” 

Schlosser said HCA is “very excited” about Gemini, and the company is already working out plans to test the technology. “We think that may give us an additional level of performance when we get that,” he said.

Another company that has been using MedLM is BenchSci, which aims to use AI to solve problems in drug discovery. Google is an investor in BenchSci, and the company has been testing its MedLM technology for a few months.  

Liran Belenzon, BenchSci’s co-founder and CEO, said the company has merged MedLM’s AI with BenchSci’s own technology to help scientists identify biomarkers, which are key to understanding how a disease progresses and how it can be cured. 

Belenzon said the company spent a lot of time testing and validating the model, including providing Google with feedback about necessary improvements. Now, Belenzon said BenchSci is in the process of bringing the technology to market more broadly.  

“[MedLM] doesn’t work out of the box, but it helps accelerate your specific efforts,” he told CNBC in an interview. 

Corrado said research around MedLM is ongoing, and he thinks Google Cloud’s health-care customers will be able to tune models for multiple different use cases within an organization. He added that Google will continue to develop domain-specific models that are “smaller, cheaper, faster, better.”  

Like BenchSci, Deloitte tested MedLM “over and over” before deploying the technology to health-care clients, said Dr. Kulleni Gebreyes, Deloitte’s U.S. life sciences and health-care consulting leader.

Deloitte is using Google’s technology to help health systems and health plans answer members’ questions about accessing care. If a patient needs a colonoscopy, for instance, they can use MedLM to look for providers based on gender, location or benefit coverage, as well as other qualifiers. 

Gebreyes said clients have found that MedLM is accurate and efficient, but it’s not always great at deciphering a user’s intent. It can be a challenge if patients don’t know the right word or spelling for colonoscopy, or use other colloquial terms, she said. 

“Ultimately, this does not substitute a diagnosis from a trained professional,” Gebreyes told CNBC. “It brings expertise closer and makes it more accessible.”