JACR: A natural language processing primer

As the victory of IBM’s supercomputer Watson on Jeopardy! highlighted, natural language processing (NLP) has made great strides since its introduction in the 1950s, and radiologists are now setting their sights on the technology’s approaching clinical applications, according to the authors of a June commentary in the Journal of the American College of Radiology.

NLP made its debut in medicine in the mid-1960s via an automated psychotherapist named ELIZA. The challenge for ELIZA, whose technology was grounded in a database of keywords, was to converse with a human so convincingly that the computer was indistinguishable from a person, explained Ronilda Lacson, MD, PhD, and Ramin Khorasani, MD, MPH, from Brigham and Women’s Hospital in Boston.

“Currently, NLP is defined as ‘a theoretically motivated range of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis for the purpose of achieving human-like language processing for a range of tasks or applications,’” Lacson and Khorasani offered.

NLP has become a field of study focused on understanding the full meaning of written or spoken text. NLP systems and algorithms integrate concepts and methods from a variety of domains, the authors pointed out, including computer science, linguistics, psychology, information theory, mathematics and statistics.

Within this field, NLP consists of several levels of language analysis, Lacson and Khorasani explained:

  • Morphological knowledge: How words are constructed from basic units, or morphemes. In “The nodule is smaller,” the word ‘smaller’ consists of two morphemes, the root ‘small’ and the suffix ‘er,’ and conveys a comparison involving the nodule.
  • Lexical knowledge: The meaning of individual words, which software can pin down using word sense or part of speech.
  • Syntactic knowledge: How words are structured within a sentence.
  • Semantic knowledge: The way in which the meanings of individual words combine to form the meaning of a sentence.
  • Discourse knowledge: Understanding text in light of adjacent sentences. This can include anaphora resolution, in which a pronoun is linked to something mentioned in a previous sentence.
  • Pragmatic knowledge: Understanding sentences in various contexts, where world knowledge or a user’s goals and beliefs are invoked to grasp abstract or non-literal meanings.
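
To make the first of these levels concrete, the morphological step can be sketched in a few lines of Python. This is a toy illustration only, not the method described by Lacson and Khorasani: the suffix table and the function names `split_morphemes` and `analyze` are hypothetical, and the simple suffix stripping shown here would mis-split many real words (for example, "number").

```python
# Toy morphological analysis: split a word into a root and a known suffix.
# The suffix table below is an illustrative assumption, not a real lexicon.
SUFFIXES = {
    "er": "comparative",
    "est": "superlative",
    "s": "plural",
}

def split_morphemes(word):
    """Return (root, suffix, meaning); suffix and meaning are None if nothing matches."""
    lowered = word.lower()
    # Try longer suffixes first, and require a root of at least three letters.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[:-len(suffix)], suffix, SUFFIXES[suffix]
    return lowered, None, None

def analyze(sentence):
    """Split a sentence on whitespace and analyze each token."""
    tokens = sentence.rstrip(".").split()
    return [split_morphemes(token) for token in tokens]

if __name__ == "__main__":
    for root, suffix, meaning in analyze("The nodule is smaller."):
        print(root, suffix, meaning)
```

Run on "The nodule is smaller," the sketch leaves the first three words whole and splits only "smaller" into the root "small" and the comparative suffix "er." The higher levels in the list (semantic, discourse and pragmatic) need context that a per-word lookup like this cannot supply.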
 
The authors noted that although research offers a clear conceptual sense of these analytic tools, much work remains to be done to build the more complex interpretations of discourse knowledge and context into software. “We will therefore continue to work on demonstrating and validating meaningful use of NLP systems, while further elucidating the role of NLP in radiology,” concluded Lacson and Khorasani.
