AI earns high marks for evaluating x-rays in ED setting
Deep learning (DL) algorithms can be trained to flag suspicious chest x-rays in an emergency department (ED) setting, according to new research published in Radiology.
“For DL algorithms to be clinically useful in medical imaging, their performance should be validated in a study sample that reflects clinical applications of this new technology,” wrote Eui Jin Hwang, department of radiology at Seoul National University College of Medicine in Korea, and colleagues. “Thus, the purpose of our study was to evaluate the performance of a DL algorithm in the identification of chest radiographs with clinically relevant abnormalities in the ED setting.”
The authors used a previously developed deep learning algorithm to analyze data from more than 1,000 consecutive patients who visited a single ED and underwent chest x-rays from Jan. 1 to March 31, 2017. The algorithm’s performance was then compared to that of a group of on-call radiology residents, who interpreted the imaging findings as they normally would.
Overall, the team found that the algorithm achieved an area under the receiver operating characteristic curve (AUC) of 0.95 for detecting relevant abnormalities. At the team’s chosen high-sensitivity cutoff (a probability score of 0.16), it had a sensitivity of 88.7% and a specificity of 69.6%. At the chosen high-specificity cutoff (a probability score of 0.46), sensitivity was 81.6% and specificity was 90.3%.
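To illustrate how a cutoff on the algorithm's probability score trades sensitivity against specificity, here is a minimal Python sketch. It is not the authors' code: the function name `sensitivity_specificity`, the toy scores and labels, and the rule that a score at or above the cutoff counts as flagged are all assumptions made for illustration. Only the two cutoff values (0.16 and 0.46) come from the study.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Return (sensitivity, specificity) for a given probability cutoff.

    Assumption for this sketch: a radiograph is flagged as abnormal
    when its score is greater than or equal to the cutoff.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        flagged = score >= cutoff
        if label == 1:          # truly abnormal radiograph
            if flagged:
                tp += 1
            else:
                fn += 1
        else:                   # truly normal radiograph
            if flagged:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn)   # share of abnormal cases flagged
    specificity = tn / (tn + fp)   # share of normal cases left unflagged
    return sensitivity, specificity


# Hypothetical toy data (NOT from the study): five normal, five abnormal.
scores = [0.05, 0.10, 0.20, 0.30, 0.50, 0.12, 0.40, 0.60, 0.80, 0.95]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# The two cutoffs reported in the study.
for cutoff in (0.16, 0.46):
    sens, spec = sensitivity_specificity(scores, labels, cutoff)
    print(f"cutoff={cutoff:.2f}  sensitivity={sens:.1%}  specificity={spec:.1%}")
```

On this toy data, raising the cutoff from 0.16 to 0.46 lowers sensitivity from 80% to 60% and raises specificity from 40% to 80%. The direction of that trade-off matches the one reported in the study, though the magnitudes here are purely illustrative.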
The residents, meanwhile, had higher specificity and lower sensitivity than the algorithm. When they used the algorithm’s output, however, their sensitivity increased.
“The algorithm showed high efficacy in the classification of radiographs with clinically relevant abnormalities from the ED in this ad hoc retrospective review,” the authors wrote. “This suggests that this deep learning algorithm is ready for further testing in a controlled real-time ED setting.”
In addition, the authors noted, algorithms such as the one they evaluated could make a significant difference in screening or triaging patients.
“During the study period, the interval between image acquisition and reporting was paradoxically longer in radiographs with relevant abnormalities,” Hwang et al. wrote. “In this regard, the algorithm may improve clinical workflow in the ED by screening radiographs before interpretation by ED physicians and radiologists. The algorithm can inform physicians and radiologists if there is a high probability of relevant disease necessitating timely diagnosis and management.”