How NLP and Machine Learning Power Healthcare Analytics

Natural Language Processing in Healthcare


Natural language processing (NLP) is a branch of AI that enables machines to read and interpret human language. With NLP, machines can perform tasks such as translation, speech recognition, topic classification, and keyword extraction.


Applying NLP to healthcare language is harder than most applications because medical terminology is highly specialized and clinical concepts are documented in widely varying, inconsistent ways. Clinical Natural Language Processing (CNLP) models are trained with deep learning techniques to handle the nuances of healthcare language and the complexity of clinical conditions.


One of the most common functions of NLP in healthcare today is to identify and extract relevant clinical information from medical records and convert the results into an industry-recognized standard, so that it can be combined with structured data for use in analysis tools. Patient chart digitization is possible thanks to Optical Character Recognition (OCR), a technique that translates handwritten or printed material into machine-readable text. OCR makes NLP techniques more effective by widening the scope of data available to mine for analysis.
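To make the extraction step concrete, here is a minimal sketch of mapping free-text mentions to standard codes. It is a simple dictionary lookup, not the deep learning approach described above, and it ignores negation ("no history of hypertension") and abbreviations. The `TERM_TO_CODE` table and the `extract_codes` function are hypothetical names introduced for illustration; the codes are in ICD-10-CM style.

```python
import re
from typing import List

# Hypothetical phrase-to-code lookup used only for illustration.
# A production system would rely on a full clinical vocabulary.
TERM_TO_CODE = {
    "type 2 diabetes": "E11.9",
    "hypertension": "I10",
    "copd": "J44.9",
}


def extract_codes(note: str) -> List[str]:
    """Return codes for known terms found in the note, ordered by first mention."""
    text = note.lower()
    hits = []
    for term, code in TERM_TO_CODE.items():
        # Whole-word match so that short terms do not match inside longer words.
        match = re.search(r"\b" + re.escape(term) + r"\b", text)
        if match:
            hits.append((match.start(), code))
    return [code for _, code in sorted(hits)]


if __name__ == "__main__":
    print(extract_codes("Pt with hypertension and COPD, presenting for follow-up."))
```

Once mentions are expressed as codes, they can be joined with claims or lab data that already use the same coding system.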


Machine Learning in Healthcare


Another key building block of AI, machine learning (ML), is used to automate NLP and other processes. Machine learning is the process of applying algorithms that let machines learn automatically and improve from experience without being explicitly programmed. Large amounts of input data are used to train ML models to develop predictions and insights without human intervention. ML models adapt every time they comb through the data and find new patterns. This learning process enables increasingly accurate and robust outputs.


Machine learning has been applied to a wide range of challenges and opportunities in healthcare, including clinical decision support, precision medicine, risk stratification, disease progression modeling, and subtype discovery.


NLP and Machine Learning for More Accurate Disease Identification


Much of the focus on ML in healthcare so far has been on disease identification and diagnosis. A combination of NLP and ML algorithms can identify suspected conditions: possible diagnoses that are indicated in clinical data but remain undocumented, either because they were coded incorrectly or not coded at all. Once NLP has translated the data into standard formats, ML can be trained to recognize links between the various value sets of an underlying condition. For example, a diagnosis of hypertension, combined with a recent test result showing an A1C of 6.7% and a prescription for insulin, would strongly indicate that a patient has diabetes. Similarly, a diagnosis of sleep apnea and obesity, a claim for oxygen delivery, and a prescription for Flovent may indicate that a patient has COPD. These algorithms can also take limiting factors into account, such as whether a medication has off-label uses not associated with a particular condition, and the age of the supporting evidence.
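The diabetes example above can be sketched as a hand-written rule. This is only an illustration of how several pieces of evidence combine; a trained model would learn such links from data instead of hard-coding them. The `Evidence` class, the `suspect_diabetes` function, the one-year `MAX_AGE_DAYS` cutoff, and the 6.5% A1C threshold are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import date

# Assumed cutoff: evidence older than this is ignored (the "age of the
# supporting evidence" limiting factor).
MAX_AGE_DAYS = 365
# Assumed threshold for an elevated A1C result, in percent.
A1C_THRESHOLD = 6.5


@dataclass
class Evidence:
    kind: str          # "diagnosis", "lab", or "medication"
    name: str          # normalized name, e.g. "a1c" or "insulin"
    observed: date     # when the evidence was recorded
    value: float = 0.0  # numeric result for labs, unused otherwise


def suspect_diabetes(evidence, today):
    """Return True when recent evidence matches the hypertension + A1C + insulin pattern."""
    recent = [e for e in evidence if (today - e.observed).days <= MAX_AGE_DAYS]
    has_hypertension = any(
        e.kind == "diagnosis" and e.name == "hypertension" for e in recent
    )
    high_a1c = any(
        e.kind == "lab" and e.name == "a1c" and e.value >= A1C_THRESHOLD
        for e in recent
    )
    on_insulin = any(e.kind == "medication" and e.name == "insulin" for e in recent)
    return has_hypertension and high_a1c and on_insulin


if __name__ == "__main__":
    chart = [
        Evidence("diagnosis", "hypertension", date(2024, 1, 10)),
        Evidence("lab", "a1c", date(2024, 5, 1), 6.7),
        Evidence("medication", "insulin", date(2024, 4, 15)),
    ]
    print(suspect_diabetes(chart, today=date(2024, 6, 1)))
```

Filtering on `observed` before checking the pattern is what lets stale evidence drop out without changing the rule itself.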


AI has progressed to the point that it can predict possible conditions even without historical evidence, or before supporting documentation exists, based on other health attributes and demographic indicators. Predictive modeling is a technique that estimates the statistical probability of additional diagnoses by comparing a patient to others with similar backgrounds and conditions. These algorithms look for similarities among patients with a shared condition and review those similarities in the context of the patient's profile to determine whether the shared traits are predictive.
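One simple way to illustrate comparison against similar patients is a nearest-neighbor estimate: find the most similar patients in a cohort and report the share of them who have the condition. This is a minimal sketch, not the specific predictive model the article refers to. The attribute labels, the `jaccard` and `estimate_probability` functions, and the choice of Jaccard similarity are assumptions for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of patient attributes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def estimate_probability(patient, cohort, k=3):
    """Estimate the chance of a condition from the k most similar patients.

    cohort is a list of (attribute_set, has_condition) pairs.
    """
    ranked = sorted(cohort, key=lambda row: jaccard(patient, row[0]), reverse=True)
    nearest = ranked[:k]
    if not nearest:
        return 0.0
    return sum(1 for _, has_condition in nearest if has_condition) / len(nearest)


if __name__ == "__main__":
    cohort = [
        ({"age_60s", "obesity", "smoker"}, True),
        ({"age_60s", "obesity"}, True),
        ({"age_20s", "athlete"}, False),
        ({"age_60s", "smoker"}, False),
    ]
    print(estimate_probability({"age_60s", "obesity", "smoker"}, cohort, k=3))
```

The returned fraction is only as reliable as the cohort is representative, which is why the shared traits still need to be checked for predictive value.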


AI can also predict and prioritize suggestions with a higher likelihood of confirmation. This method typically assigns confidence scores that reflect the statistical probability that a chart review and/or follow-up appointment will result in a confirmation that the suggestion is accurate. Coders or physicians can then review the data to determine if clinical documentation is sufficient to support the suggestion.
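The prioritization step can be sketched as filtering and sorting suggestions by their confidence score, so that reviewers see the most likely confirmations first. The record layout, the `prioritize` function, and the 0.5 default cutoff are hypothetical choices for this example; how the scores themselves are produced is outside its scope.

```python
def prioritize(suggestions, min_confidence=0.5):
    """Drop low-confidence suggestions and sort the rest, highest score first."""
    kept = [s for s in suggestions if s["confidence"] >= min_confidence]
    return sorted(kept, key=lambda s: s["confidence"], reverse=True)


if __name__ == "__main__":
    queue = prioritize(
        [
            {"patient_id": "A", "condition": "diabetes", "confidence": 0.62},
            {"patient_id": "B", "condition": "COPD", "confidence": 0.91},
            {"patient_id": "C", "condition": "diabetes", "confidence": 0.30},
        ]
    )
    for item in queue:
        print(item["patient_id"], item["condition"], item["confidence"])
```

The cutoff trades review workload against the risk of missing a true condition, so it would normally be tuned against confirmed chart reviews.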
