FDA commissioner urges health systems to strengthen AI quality oversight

With the U.S. Food and Drug Administration (FDA) on track to clear its 1,000th clinical artificial intelligence (AI) algorithm by the end of 2024, FDA Commissioner Robert Califf, MD, is highlighting the urgent need for hospitals and health systems to develop robust quality assurance mechanisms for AI. He spoke with HealthExec in an interview at the Transcatheter Cardiovascular Therapeutics (TCT) 2024 annual meeting for interventional cardiology.

Califf emphasized the risks of relying solely on pre-market approvals, noting the potential for AI algorithms to deviate, or "drift," from their original, FDA-cleared functions over time.

“I’m on the AI bandwagon,” Califf said, acknowledging the transformative potential of generative AI and other technologies in healthcare. However, he cautioned that oversight is a complex challenge, one that the FDA cannot tackle alone.

“There's no way the FDA can oversee every algorithm,” Califf explained. “We need an ecosystem where health systems themselves take on the responsibility for ongoing validation.”

He likened the situation to food safety practices under the Food Safety Modernization Act (FSMA), where farmers are expected to follow FDA protocols between inspections, which may occur only once every five years.

The critical gap in AI post-market validation

Califf stressed that the biggest concern lies in the post-market phase of AI deployment. Unlike traditional medical devices, AI systems can evolve as they interact with new data, leading to either improved or degraded performance. Without continuous validation, health systems risk relying on algorithms that no longer provide accurate or reliable results.

“It’s clear that particularly with generative AI, you put it in place and then it changes,” he said. “The big deal in AI isn’t the pre-market phase, it’s the post-market phase. You need to continuously validate the model in the environment in which you're using it. And unfortunately, there's not a single health system in the U.S. that can do that now. So we probably have some bad algorithms out there already, and we're going to have to figure out how to create this different approach to information technology that enables validation.”

Califf said this gap in quality assurance to check for AI drift poses significant risks, particularly for critical applications like stroke prediction, where changing patient demographics and data can undermine the accuracy of AI-driven predictions.
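Neither Califf nor the FDA prescribes a specific monitoring mechanism, but the kind of continuous validation he describes can be sketched in code. Below is a minimal, hypothetical Python illustration that recomputes sensitivity and specificity over a rolling window of confirmed outcomes and flags a model whose performance slips below a locally set baseline; the window size and thresholds are assumptions for demonstration, not regulatory guidance.

```python
from collections import deque


class DriftMonitor:
    """Rolling post-market QA check for a binary clinical AI model.

    Recomputes sensitivity and specificity over the most recent
    confirmed cases. Window size and alert thresholds here are
    illustrative assumptions, not regulatory guidance.
    """

    def __init__(self, window: int = 500,
                 min_sensitivity: float = 0.85,
                 min_specificity: float = 0.80):
        self.cases = deque(maxlen=window)  # (predicted, actual) pairs
        self.min_sensitivity = min_sensitivity
        self.min_specificity = min_specificity

    def record(self, predicted: bool, actual: bool) -> None:
        """Log one model prediction alongside the confirmed outcome."""
        self.cases.append((predicted, actual))

    def metrics(self) -> tuple[float, float]:
        """Return (sensitivity, specificity) for the current window."""
        tp = sum(1 for p, a in self.cases if p and a)
        fn = sum(1 for p, a in self.cases if not p and a)
        tn = sum(1 for p, a in self.cases if not p and not a)
        fp = sum(1 for p, a in self.cases if p and not a)
        sens = tp / (tp + fn) if (tp + fn) else float("nan")
        spec = tn / (tn + fp) if (tn + fp) else float("nan")
        return sens, spec

    def drifted(self) -> bool:
        """True when windowed performance falls below the baseline."""
        sens, spec = self.metrics()
        return sens < self.min_sensitivity or spec < self.min_specificity


# Hypothetical usage: log each prediction once the outcome is confirmed.
monitor = DriftMonitor()
monitor.record(predicted=True, actual=True)  # a confirmed true positive
if monitor.drifted():
    print("Performance below baseline: escalate to the AI QA team.")
```

The arithmetic is the easy part. As Califf notes, every `record()` call presupposes that a prediction can be linked to a confirmed clinical outcome, and that follow-up data is exactly what most health systems cannot yet assemble.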

Barriers to AI validation at hospitals

A significant obstacle to effective AI oversight, according to Califf, is the fragmented nature of healthcare data. While individual patient records can often be accessed across systems, the aggregate data needed to evaluate AI performance is siloed, limiting the ability to assess sensitivity, specificity and overall reliability.

“It’s really culture and data hoarding that’s the issue,” Califf remarked.

He said healthcare systems, and even public data reporting by state health departments, remain deeply disconnected, so information must be pulled from many disparate systems. Releasing that data is also a sea of red tape, because it requires agreements with each health entity in charge of these data silos.

He said this was a real problem when assessing the effectiveness of COVID-19 vaccines. Rather than piecing together state health data, he found it easier to call colleagues in Israel, where health data is centralized nationally, and he was able to get answers much faster.

“To continuously validate an algorithm, you need comprehensive follow-up data on the population it’s applied to—and right now, that’s a capability we simply don’t have,” Califf said.

A path forward in monitoring AI

Califf called for a fundamental shift in how health systems manage AI, urging the creation of collaborative, data-driven ecosystems that prioritize transparency and accountability. He also underscored the importance of equipping health systems with the technical expertise and resources needed to monitor AI performance over time.

As AI continues to reshape the healthcare landscape, Califf’s warning serves as a timely reminder: Innovation must be matched by vigilance to ensure that technology improves care without introducing new risks.

The FDA plans to continue working with stakeholders to address these challenges, but Califf made it clear that the responsibility must be shared across the entire healthcare ecosystem.

Radiologists echo FDA commissioner's concerns

Since close to 80% of FDA-cleared clinical AI algorithms are for medical imaging, the massive Radiological Society of North America (RSNA) annual meeting has become the largest clinical AI conference, with more than 200 vendors exhibiting imaging algorithms. Key opinion leaders at the 2024 meeting last week echoed Califf's call for AI oversight.

Among the biggest AI discussions at RSNA this year was the need to prevent AI bias and to perform quality assurance (QA) assessments to make sure AI is still working as intended. Although algorithms are required to be locked down in their final FDA-cleared configuration, some have been observed to drift as the data they encounter changes, gradually making their outputs less accurate over time.

"When it comes to AI bias, it's not just about your patient population, is also about your equipment or anything that changes the data keeps it out of distribution from what the training data was," explains radiology AI expert Nina Kottler, MD, associate chief medical officer for clinical AI, Radiology Partners in an interview with HealthExec. She said this can be cause when a center gets a new scanner for a software upgrade to the scanner, or a new technology comes in using a different imaging protocol. She added that some on the technical change aspects may even have a bigger impact than differences in race and ethnicity in a hospital's patient population. She said these seemingly small changes in the data can impact the quality of the AI performance over time.

To address this, she said health systems need a way to assess algorithms over time, which will likely lead to the creation of new AI administration positions responsible for QA of these systems and for evaluating the effectiveness of both installed algorithms and new ones being considered for adoption.

But to measure something, you first need to know how the technology performs at baseline. Many AI algorithms are a so-called "black box," where data come in and (mostly) correct answers come out, but how the AI reaches those decisions is not always clear. This has led to a push in radiology for transparency in the foundation models being adopted. Kottler and other AI experts speaking at RSNA said companies will need to find ways to show how their AI reaches its conclusions so health systems can test it, monitor for drift, and ensure accuracy is maintained over time as new data come in.

"As the implantation of AI in departments progresses, what has become clear is that legacy systems we are used to, such as scanners PACS and RIS systems, where built in a time when AI was not a thing. What is becoming clear is that the AI technology does not perform the same over time, and departments need to monitor the performance. We change scanners and protocols over time, so even though the AI product may have worked well when you first put it in, you really need to keep an eye on that over time," explained Christoph Wald, MD, PhD, MBA, FACR, vice chair of the ACR Board of Chancellors and chair of the ACR Commission on Informatics.

Dave Fornell has covered healthcare for more than 17 years, with a focus on cardiology and radiology. Fornell is a five-time winner of the Jesse H. Neal Award, one of the most prestigious editorial honors in specialized journalism. His wins included best technical content, best use of social media and best COVID-19 coverage. Fornell was also a three-time Neal finalist for best range of work by a single author. He produces more than 100 editorial videos each year, most of them interviews with key opinion leaders in medicine. He also writes technical articles, covers key trends, conducts video hospital site visits, and is very involved with social media. E-mail: dfornell@innovatehealthcare.com
