GPT-4, the latest upgrade from the team behind ChatGPT, can help doctors with difficult diagnoses
GPT-4, a new ChatGPT-like language model from OpenAI, can assist clinicians with difficult cases by suggesting primary and differential diagnoses, according to new research published in JAMA Network Open.[1] The study adds to the growing evidence that artificial intelligence (AI) is poised to change patient care forever.
A team of researchers from the division of geriatrics at Queen Mary Hospital in Hong Kong used GPT-4, a new AI-powered language model designed to be “more creative and collaborative” than ChatGPT, to analyze the medical histories of patients aged 65 or older whose definitive diagnoses had been delayed by more than a month.
“AI, especially machine learning, has been increasingly used in diagnosing conditions such as skin or breast cancer and Alzheimer disease,” wrote first author Yat-Fung Shea, MBBS, and colleagues. “However, AI relies on clinical imaging. In low-income countries, where specialist care may be lacking, AI may be useful for making clinical diagnoses.”
The group used GPT-4 to evaluate the medical histories of six patients at three points: at admission, one week after admission, and just before the final diagnosis. GPT-4's primary diagnoses were accurate for four of the six patients, surpassing the human clinicians, whose diagnoses were accurate for only two. Isabel DDx Companion, a traditional medical diagnostic decision support system included in the analysis, did not provide an accurate diagnosis for any of the patients.
In addition, when both primary and differential diagnoses were considered, GPT-4 identified the correct diagnosis for an impressive five out of six patients.
“GPT-4 may increase confidence in diagnosis and earlier commencement of appropriate treatment, alert clinicians [to] missing important diagnoses, and offer suggestions similar to specialists to achieve the correct clinical diagnosis, which has potential value in low-income countries with lack of specialist care,” the authors wrote.
GPT-4's effectiveness relied on the comprehensive entry of patient data, including demographic, clinical, radiological, and pharmacological information. One key limitation, the authors added, was that some of its recommendations were not medically appropriate. However, they still saw significant potential in its capabilities.
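For readers curious what that kind of comprehensive data entry might look like in practice, below is a minimal sketch of submitting a structured, de-identified patient history to GPT-4 through the OpenAI Python API and asking for a primary diagnosis plus a differential list. The patient details, prompt wording, and workflow here are illustrative assumptions; the paper does not publish the exact prompts or interface the team used.

```python
# Minimal sketch: asking GPT-4 for primary and differential diagnoses.
# Assumes the OpenAI Python client (pip install openai) and an API key
# in the OPENAI_API_KEY environment variable. The patient summary below
# is entirely fictional and for illustration only.
from openai import OpenAI

client = OpenAI()

# Comprehensive, de-identified patient data: demographic, clinical,
# radiological, and pharmacological details, since the authors note
# GPT-4's usefulness depended on entering all of these.
patient_summary = """
Demographics: 78-year-old woman.
Clinical: two months of low-grade fever, weight loss, and fatigue.
Radiology: chest CT shows mediastinal lymphadenopathy.
Medications: amlodipine, metformin.
Labs: normocytic anemia, elevated ESR.
"""

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You are assisting a geriatrician. Suggest a primary "
                    "diagnosis and a ranked list of differential diagnoses."},
        {"role": "user", "content": patient_summary},
    ],
)

print(response.choices[0].message.content)
```

As the authors' caveat suggests, anything a sketch like this returns would still need review by a clinician before it influenced care.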
“Overall, our findings suggest that the use of AI in diagnosis is both promising and challenging,” the authors concluded.
The full study, “Use of GPT-4 to Analyze Medical Records of Patients With Extensive Investigations and Delayed Diagnosis,” is available in JAMA Network Open.