Machine learning sees diseases obscured by info overload

Harvard researchers have demonstrated a way to cut through tangles of irrelevant information in electronic health records (EHRs) while applying machine learning to spot patterns indicative of specific disease markers.

The team, led by Hossein Estiri, PhD, of the Mass General Laboratory of Computer Science, details the work in a study published in Patterns, an open-access journal from Cell Press.

In their project overview, the authors point out that billions of dollars have been spent trying to wring value out of EHRs. Yet the systems remain too complex to mine for modeling diseases and outcomes without human involvement.

The approach developed by Estiri and colleagues combines a sequential pattern-mining algorithm with a machine learning pipeline. The combination “can be rapidly deployed to develop computational models for identifying and validating novel disease markers and advancing medical knowledge discovery,” they write.

In materials sent to the press by Mass General, the team offers an example: the system predicted heart failure in patients who first had coronary artery disease and then chest pain. Both conditions were recorded in the EHR, and the experimental approach proved better at predicting heart failure than either factor on its own or the two in a different order.
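
To make the idea of order-sensitive patterns concrete, here is a minimal sketch of how ordered pairs of EHR events might be turned into model features. It is an illustration only, not the algorithm from the study: the patient records, the event labels, the `min_support` threshold and both function names are hypothetical, and a real pipeline would go on to feed such features into a classifier.

```python
from itertools import combinations

# Hypothetical toy data: each patient has a time-ordered list of coded EHR events.
records = {
    "p1": ["CAD", "chest_pain"],
    "p2": ["chest_pain", "CAD"],
    "p3": ["CAD", "chest_pain", "diabetes"],
}

def ordered_pairs(events):
    """Return the set of (earlier, later) pairs of distinct events in one timeline."""
    pairs = set()
    for i, j in combinations(range(len(events)), 2):
        if events[i] != events[j]:
            pairs.add((events[i], events[j]))
    return pairs

def sequence_features(records, min_support=2):
    """Keep ordered pairs seen in at least min_support patients,
    then build one binary feature row per patient."""
    per_patient = {pid: ordered_pairs(ev) for pid, ev in records.items()}
    counts = {}
    for pairs in per_patient.values():
        for pair in pairs:
            counts[pair] = counts.get(pair, 0) + 1
    kept = sorted(p for p, c in counts.items() if c >= min_support)
    rows = {pid: [int(p in pairs) for p in kept]
            for pid, pairs in per_patient.items()}
    return kept, rows

kept, rows = sequence_features(records)
print(kept)  # ordered pairs that met the support threshold
print(rows)  # binary feature row for each patient
```

In this toy data, "CAD then chest pain" is kept as a feature because two patients share it, while the reverse order appears only once and is dropped. That is the sense in which the order of events, and not just their presence, carries signal.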

“The computer sorts through thousands of patients and can find sequences that a physician would likely never identify on their own as relevant but actually are associated with the disease,” Estiri explains.

Mass General adds that the system might help identify patients at risk of developing any number of other diseases and then recommend evaluation by an appropriate specialty.

Dave Pearson

Dave P. has worked in journalism, marketing and public relations for more than 30 years, frequently concentrating on hospitals, healthcare technology and Catholic communications. He has also specialized in fundraising communications, ghostwriting for CEOs of local, national and global charities, nonprofits and foundations.
