Mute AI not to be trusted to help make imaging-based diagnoses; explainable AI, fire away

Black-box AI should be barred from reading medical images in clinical settings because machine learning, like human thinking, tends to take diagnostic shortcuts, and basic safety demands that those shortcuts be open to explanation.

A study published May 31 in Nature Machine Intelligence bears this out.

Researchers at the University of Washington in Seattle began their investigation by reviewing the literature to assess datasets and AI models used for diagnosing COVID-19 from chest X-rays.

Su-In Lee, PhD, and colleagues paid special attention to studies using AI approaches they deemed at high risk of “worst-case confounding.”

An example of this effect: assuming an elderly patient is COVID-positive because he or she has, say, a fever and sore throat, even though findings on chest imaging are inconclusive.

To uncover such shortcutting, Lee and team first trained deep convolutional neural networks on image datasets resembling those used in the published studies.

Next they tested the models on COVID case mockups representing both single-hospital and multi-institution settings.

“[A] model that relies on valid medical pathology—which should not change between datasets—should maintain high performance,” the authors point out.
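To make that comparison concrete, here is a minimal sketch in Python/PyTorch of the internal-versus-external check the researchers describe. Everything in it is a stand-in: random tensors play the role of chest X-rays from two hypothetical hospitals, and a toy network plays the role of the study’s deep convolutional models. The point is only the evaluation logic, not the UW team’s actual pipeline.

```python
# Sketch of the internal-vs-external check described above.
# Hypothetical stand-ins: random tensors play the role of chest X-rays from
# "hospital A" (training + internal test) and "hospital B" (external test).
# A real replication would swap in actual radiograph datasets and a full
# training regimen; only the evaluation logic is the point here.
import torch
import torch.nn as nn
from sklearn.metrics import roc_auc_score

torch.manual_seed(0)

class TinyCNN(nn.Module):
    """Small convolutional classifier standing in for a deep CNN."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(16, 1)

    def forward(self, x):
        return self.head(self.features(x).flatten(1)).squeeze(1)

def make_fake_site(n, shift):
    """Invented data for one 'site'; `shift` mimics site-specific artifacts."""
    x = torch.randn(n, 1, 64, 64) + shift
    y = torch.randint(0, 2, (n,)).float()
    return x, y

def auroc(model, x, y):
    model.eval()
    with torch.no_grad():
        scores = torch.sigmoid(model(x))
    return roc_auc_score(y.numpy(), scores.numpy())

# "Site A" for training and internal testing, "site B" as the external test set.
x_train, y_train = make_fake_site(256, shift=0.0)
x_int, y_int = make_fake_site(128, shift=0.0)
x_ext, y_ext = make_fake_site(128, shift=0.5)

model = TinyCNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(5):  # token training loop
    opt.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    opt.step()

# A model leaning on valid pathology should score similarly on both test sets;
# a large internal-to-external drop is a red flag for shortcut learning.
print(f"internal AUROC: {auroc(model, x_int, y_int):.2f}")
print(f"external AUROC: {auroc(model, x_ext, y_ext):.2f}")
```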

Unsurprisingly, the single-site application far outperformed its multisite counterpart.

However, both showed evidence of shortcutting.

The worst performance turned up when models were trained on data synthesized from separate datasets.

Such synthesis “introduces near worst-case confounding and thus abundant opportunity for models to learn these [inappropriate] shortcuts,” Lee and co-authors comment.

“Importantly,” they add, “because undesirable ‘shortcuts’ may be consistently detected in both internal and external domains, our results warn that external test set validation alone may be insufficient to detect poorly behaved models.”
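To see why mixing sources is so treacherous, consider a hypothetical in which every COVID-positive image happens to come from one hospital and every negative image from another, and the first hospital stamps a bright marker on its films. The toy example below (plain Python with scikit-learn, all data invented) shows a classifier happily leaning on the marker instead of the weak disease signal, which is the essence of the shortcut problem the authors describe.

```python
# Toy illustration of worst-case confounding: the "marker" feature stands in
# for a source-specific artifact that happens to track the label almost
# perfectly, while "pathology" stands in for a weak, noisy disease signal.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 400
label = rng.integers(0, 2, n)                # 1 = COVID-positive (invented)
pathology = label + rng.normal(0, 2.0, n)    # weak, noisy disease signal
marker = label + rng.normal(0, 0.1, n)       # source artifact, tracks label

X = np.column_stack([pathology, marker])
clf = LogisticRegression().fit(X, label)
print("weight on pathology feature:", round(float(clf.coef_[0][0]), 2))
print("weight on marker feature:   ", round(float(clf.coef_[0][1]), 2))
# The model leans on the marker (the shortcut), a cue that evaporates as soon
# as it meets data from a site without that artifact.
```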

The results also buttress the case for using explainable AI, and for now only explainable AI, across clinical use cases, the authors underscore.

In coverage from UW’s news division, Lee says she and her team remain hopeful about AI’s future in imaging-based medical diagnostics.

“I believe we will eventually have reliable ways to prevent AI from learning shortcuts, but it’s going to take some more work to get there,” she says. “Going forward, explainable AI is going to be an essential tool for ensuring these models can be used safely and effectively to augment medical decisionmaking and achieve better outcomes for patients.”
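What might “explainable” look like in practice? One common option, sketched below in PyTorch, is a gradient saliency map that highlights which pixels drive a model’s score. The model and image here are random stand-ins, and this article does not spell out which attribution methods the UW team favored, so treat this purely as an illustration of how such a map can reveal a model fixating on image borders or markers rather than lung anatomy.

```python
# Gradient saliency sketch: which pixels most influence the model's output?
# A randomly initialized ResNet and a random tensor stand in for a trained
# COVID classifier and a chest X-ray.
import torch
from torchvision.models import resnet18

def gradient_saliency(model, image):
    """Return |d(score)/d(pixel)| for one image tensor of shape (C, H, W)."""
    model.eval()
    x = image.unsqueeze(0).clone().requires_grad_(True)   # add batch dim
    score = model(x)[0, 0]                                 # single class logit
    score.backward()
    return x.grad.abs().squeeze(0)                         # (C, H, W)

model = resnet18(num_classes=1)      # untrained stand-in classifier
xray = torch.randn(3, 224, 224)      # stand-in "chest X-ray"
saliency = gradient_saliency(model, xray).sum(0)           # (H, W)

# Heuristic sanity check: if most attribution falls on the image border
# (laterality markers, annotations, padding) rather than the lung fields,
# the model is probably exploiting a shortcut rather than pathology.
border_share = float(1 - saliency[16:-16, 16:-16].sum() / saliency.sum())
print(f"share of attribution in the outer 16-pixel border: {border_share:.2f}")
```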

UW’s coverage is here, and the study is available in full for free.

Dave Pearson

