P107 FOOD ALLERGY AND INFORMATICS: USING NATURAL LANGUAGE PROCESSING TO IDENTIFY CLINICAL PREDICTORS IN PROGRESS NOTES

2021 ◽  
Vol 127 (5) ◽  
pp. S41-S42
Author(s):  
L. Bilaver ◽  
H. Wang ◽  
A. Naidech ◽  
Y. Luo ◽  
R. Das ◽  
...  
JAMIA Open ◽  
2019 ◽  
Vol 2 (1) ◽  
pp. 139-149 ◽  
Author(s):  
Meijian Guan ◽  
Samuel Cho ◽  
Robin Petro ◽  
Wei Zhang ◽  
Boris Pasche ◽  
...  

Abstract. Objectives: Natural language processing (NLP) and machine learning approaches were used to build classifiers that identify genomic-related treatment changes in the free-text visit progress notes of cancer patients. Methods: We obtained 5889 deidentified progress reports (2439 words on average) for 755 cancer patients who had undergone clinical next-generation sequencing (NGS) testing at Wake Forest Baptist Comprehensive Cancer Center. An NLP system was implemented to process the free-text data and extract NGS-related information. Three types of recurrent neural network (RNN), namely gated recurrent unit, long short-term memory (LSTM), and bidirectional LSTM (LSTM_Bi), were applied to classify documents into treatment-change and no-treatment-change groups. Further, we compared the performance of the RNNs with that of 5 machine learning algorithms: Naive Bayes, K-nearest Neighbor, Support Vector Machine, Random Forest, and Logistic Regression. Results: Our results suggested that, overall, RNNs outperformed traditional machine learning algorithms, and LSTM_Bi showed the best performance among the RNNs in terms of accuracy, precision, recall, and F1 score. In addition, pretrained word embeddings can improve the accuracy of LSTM by 3.4% and reduce the training time by more than 60%. Discussion and Conclusion: NLP and RNN-based text mining solutions have demonstrated advantages in information retrieval and document classification tasks for unstructured clinical progress notes.
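The four metrics named in this abstract (accuracy, precision, recall, and F1 score) can be computed from a binary classifier's predictions as in the minimal sketch below. The function name `classification_metrics` and the toy labels are illustrative assumptions; they are not taken from the study, whose data and code are not shown here.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration only; `positive` marks the
    class treated as "treatment-change".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (made up): 1 = treatment change, 0 = no treatment change.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when the two document classes are imbalanced, since accuracy alone can look high for a classifier that rarely predicts the minority class.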


2021 ◽  
Vol 23 (2) ◽  
pp. 144-153
Author(s):  
Marcus Young ◽  
Natasha Holmes ◽  
Raymond Robbins ◽  
Nada Marhoon ◽  
...  

Background: There is no gold standard approach for delirium diagnosis, making the assessment of its epidemiology difficult. Delirium can only be inferred through observation of behavioural disturbance and described with relevant nouns or adjectives. Objective: We aimed to use natural language processing (NLP) and its identification of words descriptive of behavioural disturbance to study the epidemiology of delirium in critically ill patients. Study design: Retrospective study using data collected from the electronic health records of a university-affiliated intensive care unit (ICU) in Melbourne, Australia. Participants: 12 375 patients. Intervention: Analysis of electronic progress notes. Identification using NLP of at least one of a list of words describing behavioural disturbance within such notes. Results: We analysed 199 648 progress notes in 12 375 patients. Of these, 5108 patients (41.3%) had NLP-diagnosed behavioural disturbance (NLP-Dx-BD). Compared with those who did not have NLP-Dx-BD, these patients were older, more severely ill, and more likely to have medical or unplanned admissions, neurological diagnosis, chronic kidney or liver disease and to receive mechanical ventilation and renal replacement therapy (P < 0.001). The unadjusted hospital mortality for NLP-Dx-BD patients was 14.1% versus 9.6% for patients without NLP-Dx-BD. After adjustment for baseline characteristics and illness severity, NLP-Dx-BD was not associated with increased risk of death (odds ratio [OR], 0.94; 95% CI, 0.80–1.10), a finding robust to multiple sensitivity, subgroup and time-of-observation subcohort analyses. In mechanically ventilated patients, NLP-Dx-BD was associated with decreased hospital mortality (OR, 0.80; 95% CI, 0.65–0.99) after adjustment for baseline severity of illness and year of admission. Conclusions: NLP enabled rapid assessment of large amounts of data, identifying a population of ICU patients with typical high-risk characteristics for delirium. Moreover, this technique enabled identification of previously poorly understood associations. Further investigations of this technique appear justified.
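The identification rule described in this abstract (a patient is flagged when at least one word from a list appears in any progress note) can be sketched as a whole-word keyword match. The abstract does not give the study's word list or its NLP pipeline, so the entries in `DISTURBANCE_WORDS`, the function `flag_patients`, and the sample notes below are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical word list; the study's actual list is not given in the abstract.
DISTURBANCE_WORDS = ["agitated", "confused", "delirious", "combative"]

# Whole-word, case-insensitive match on any listed word.
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, DISTURBANCE_WORDS)) + r")\b",
    re.IGNORECASE,
)


def flag_patients(notes):
    """Map each patient id to the set of listed words found in their notes.

    `notes` is an iterable of (patient_id, note_text) pairs. Patients
    with no matching word are left out of the result.
    """
    hits = defaultdict(set)
    for patient_id, text in notes:
        for match in PATTERN.finditer(text):
            hits[patient_id].add(match.group(1).lower())
    return dict(hits)


# Made-up notes for illustration.
notes = [
    (1, "Patient calm and cooperative overnight."),
    (2, "Pt became Agitated and pulled at lines."),
    (2, "Remains confused this morning."),
    (3, "Unconfused, oriented to time and place."),
]
flagged = flag_patients(notes)
print(flagged)
```

A plain keyword match like this does not handle negation ("not confused") or misspellings, so it should be read as a lower bound on what a full clinical NLP system does.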


2020 ◽  
Author(s):  
Kenneth L. Kehl ◽  
Wenxin Xu ◽  
Haitham A. Elmarakeby ◽  
Michael J. Hassett ◽  
Jackson Nyman ◽  
...  

2020 ◽  
pp. 3-17
Author(s):  
Peter Nabende

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to closing this knowledge gap, this paper evaluates the application of well-established machine translation methods to one heavily under-resourced indigenous East African language, Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. We then apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.


Diabetes ◽  
2019 ◽  
Vol 68 (Supplement 1) ◽  
pp. 1243-P
Author(s):  
JIANMIN WU ◽  
FRITHA J. MORRISON ◽  
ZHENXIANG ZHAO ◽  
XUANYAO HE ◽  
MARIA SHUBINA ◽  
...  

Author(s):  
Pamela Rogalski ◽  
Eric Mikulin ◽  
Deborah Tihanyi

In 2018, we overheard many CEEA-ACEG members stating that they have "found their people"; this led us to wonder what makes this evolving community unique. Using cultural historical activity theory to view the proceedings of CEEA-ACEG 2004-2018 in comparison with the geographically and intellectually adjacent ASEE, we used both machine-driven (Natural Language Processing, NLP) and human-driven (literature review of the proceedings) methods. Here, we hoped to build on surveys, most recently by Nelson and Brennan (2018), to understand, beyond what members say about themselves, what makes the CEEA-ACEG community distinct, where it has come from, and where it is going. Engaging in the two methods of data collection quickly diverted our focus from an analysis of the data themselves to the characteristics of the data in terms of cultural historical activity theory. Our preliminary findings point to some unique characteristics of machine- and human-driven results, with the former, as might be expected, focusing on the micro-level (words and language patterns) and the latter on the macro-level (ideas and concepts). NLP generated data within the realms of "community" and "division of labour" while the review of proceedings centred on "subject" and "object"; both found "instruments," although NLP did so with greater granularity. With this new understanding of the relative strengths of each method, we have a revised framework for addressing our original question.


2020 ◽  
Author(s):  
Vadim V. Korolev ◽  
Artem Mitrofanov ◽  
Kirill Karpov ◽  
Valery Tkachenko

The main advantage of modern natural language processing methods is the ability to turn an amorphous human-readable task into a strict mathematical form. This makes it possible to extract chemical data and insights from articles and to find new semantic relations. We propose a universal engine for processing chemical and biological texts. We successfully tested it on various use cases and applied it to the search for a therapeutic agent for COVID-19 by analyzing the PubMed archive.

