scholarly journals Use of Natural Language Processing to Improve Identification of Patients With Peripheral Artery Disease

Author(s):  
E. Hope Weissler ◽  
Jikai Zhang ◽  
Steven Lippmann ◽  
Shelley Rusincovitch ◽  
Ricardo Henao ◽  
...  

Background: Peripheral artery disease (PAD) is underrecognized, undertreated, and understudied: each of these endeavors requires efficient and accurate identification of patients with PAD. Currently, PAD patient identification relies on diagnosis/procedure codes or lists of patients diagnosed or treated by specific providers in specific locations and ways. The goal of this research was to leverage natural language processing to more accurately identify patients with PAD in an electronic health record system compared with a structured data–based approach. Methods: The clinical notes from a cohort of 6861 patients in our health system whose PAD status had previously been adjudicated were used to train, test, and validate a natural language processing model using 10-fold cross-validation. The performance of this model was described using the area under the receiver operating characteristic and average precision curves; its performance was quantitatively compared with an administrative data–based least absolute shrinkage and selection operator (LASSO) approach using the DeLong test. Results: The median (SD) of the area under the receiver operating characteristic curve for the natural language processing model was 0.888 (0.009) versus 0.801 (0.017) for the LASSO-based approach alone (DeLong P <0.0001). The median (SD) of the area under the precision curve was 0.909 (0.008) versus 0.816 (0.012) for the structured data–based approach. When sensitivity was set at 90%, the precision for LASSO was 65% and the machine learning approach was 74%, while the specificity for LASSO was 41% and for the machine learning approach was 62%. Conclusions: Using a natural language processing approach in addition to partial cohort preprocessing with a LASSO-based model, we were able to meaningfully improve our ability to identify patients with PAD compared with an approach using structured data alone. This model has potential applications to both interventions targeted at improving patient care as well as efficient, large-scale PAD research. Graphic Abstract: A graphic abstract is available for this article.

Author(s):  
Filipe R Lucini ◽  
Karla D Krewulak ◽  
Kirsten M Fiest ◽  
Sean M Bagshaw ◽  
Danny J Zuege ◽  
...  

Abstract Objective To apply natural language processing (NLP) techniques to identify individual events and modes of communication between healthcare professionals and families of critically ill patients from electronic medical records (EMR). Materials and Methods Retrospective cohort study of 280 randomly selected adult patients admitted to 1 of 15 intensive care units (ICU) in Alberta, Canada from June 19, 2012 to June 11, 2018. Individual events and modes of communication were independently abstracted using NLP and manual chart review (reference standard). Preprocessing techniques and 2 NLP approaches (rule-based and machine learning) were evaluated using sensitivity, specificity, and area under the receiver operating characteristic curves (AUROC). Results Over 2700 combinations of NLP methods and hyperparameters were evaluated for each mode of communication using a holdout subset. The rule-based approach had the highest AUROC in 65 datasets compared to the machine learning approach in 21 datasets. Both approaches had similar performance in 17 datasets. The rule-based AUROC for the grouped categories of patient documented to have family or friends (0.972, 95% CI 0.934–1.000), visit by family/friend (0.882 95% CI 0.820–0.943) and phone call with family/friend (0.975, 95% CI: 0.952–0.998) were high. Discussion We report an automated method to quantify communication between healthcare professionals and family members of adult patients from free-text EMRs. A rule-based NLP approach had better overall operating characteristics than a machine learning approach. Conclusion NLP can automatically and accurately measure frequency and mode of documented family visitation and communication from unstructured free-text EMRs, to support patient- and family-centered care initiatives.


IJOSTHE ◽  
2017 ◽  
Vol 5 (1) ◽  
pp. 10
Author(s):  
Rajul Rai ◽  
Pradeep Mewada

With development of Internet and Natural Language processing, use of regional languages is also grown for communication. Sentiment analysis is natural language processing task that extracts useful information from various data forms such as reviews and categorize them on basis of polarity. One of the sub-domain of opinion mining is sentiment analysis which is basically focused on the extraction of emotions and opinions of the people towards a particular topic from textual data. In this paper, sentiment analysis is performed on IMDB movie review database. We examine the sentiment expression to classify the polarity of the movie review on a scale of negative to positive and perform feature extraction and ranking and use these features to train our multilevel classifier to classify the movie review into its correct label. In this paper classification of movie reviews into positive and negative classes with the help of machine learning. Proposed approach using classification techniques has the best accuracy of about 99%.


Sign in / Sign up

Export Citation Format

Share Document