Eye AI performs unexceptionally in clinical settings
Detecting diabetic retinopathy in retinal images has been touted as one of medical AI’s most promising applications, as “the machine” has repeatedly equaled or bested humans at the task.
However, most of the action has taken place in research settings. A new study looking at the technology’s competence in real-world care presents some sobering findings.
Led by investigators at UW Medicine in Seattle, the study team compared the performance of seven different algorithms against that of experienced ophthalmologists.
The humans proved more accurate than six of the algorithms, and the seventh managed no better than a tie.
The project’s impact may be considerable thanks to its size, reach and ramifications.
The study population comprised more than 23,000 veterans who were screened for diabetic retinopathy at VA centers in Washington State and Georgia. The retinal images numbered more than 311,000, and the algorithms came from four countries.
Further, if not properly diagnosed and treated, diabetic retinopathy can lead to serious vision problems, including blindness.
Noting that the algorithms yielded significant performance differences not only versus the physicians but also against one another, the authors conclude the results “argue for rigorous testing of all such algorithms on real-world data before clinical implementation.”
The study report appears in the January edition of Diabetes Care, which is published by the American Diabetes Association.
In coverage posted by UW Medicine’s news operation, lead study author Aaron Lee, MD, comments: “It’s alarming that some of these algorithms are not performing consistently since they are being used somewhere in the world.”