Identification of prediabetes discussions in unstructured clinical documentation using natural language processing methods (Preprint)

10.2196/29803 ◽  
2021 ◽  
Author(s):  
Jessica Schwartz ◽  
Eva Tseng ◽  
Nisa M Maruthur ◽  
Masoud Rouhizadeh


2020 ◽  
Author(s):  
Vadim V. Korolev ◽  
Artem Mitrofanov ◽  
Kirill Karpov ◽  
Valery Tkachenko

The main advantage of modern natural language processing methods is the possibility of turning an amorphous human-readable task into a strict mathematical form. This makes it possible to extract chemical data and insights from articles and to find new semantic relations. We propose a universal engine for processing chemical and biological texts. We tested it successfully on various use cases and applied it to the task of searching for a therapeutic agent for COVID-19 by analyzing the PubMed archive.
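The abstract does not detail the engine's internals. As a minimal, purely illustrative sketch of the general idea behind such literature mining (sentence-level co-occurrence of candidate drugs with a disease term, with invented example data and function names), one might write:

```python
import re
from collections import Counter

def drug_disease_cooccurrence(abstracts, drug_terms, disease="COVID-19"):
    """Count sentences in which each candidate drug co-occurs with the disease term."""
    counts = Counter()
    for text in abstracts:
        # naive sentence split; a real pipeline would use a trained sentence tokenizer
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if disease.lower() in sentence.lower():
                for drug in drug_terms:
                    if re.search(r"\b" + re.escape(drug) + r"\b", sentence, re.I):
                        counts[drug] += 1
    return counts.most_common()

# hypothetical abstracts, for illustration only
abstracts = [
    "Remdesivir showed activity against COVID-19 in vitro. Further trials are needed.",
    "We review hydroxychloroquine use. Hydroxychloroquine did not improve COVID-19 outcomes.",
]
print(drug_disease_cooccurrence(abstracts, ["remdesivir", "hydroxychloroquine"]))
```

A semantic-relation engine would go well beyond counting, but ranking co-mentions is the usual baseline such systems are measured against.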


2020 ◽  
Vol 7 (Supplement_1) ◽  
pp. S690-S691
Author(s):  
Joshua C Herigon ◽  
Amir Kimia ◽  
Marvin Harper

Abstract

Background: Antibiotics are the most commonly prescribed drugs for children and are frequently prescribed inappropriately. Outpatient antimicrobial stewardship interventions aim to reduce inappropriate antibiotic use. Previous work has relied on diagnosis coding for case identification, which may be inaccurate. In this study, we sought to develop automated methods for analyzing note text to identify cases of acute otitis media (AOM) based on clinical documentation.

Methods: We conducted a cross-sectional retrospective chart review and sampled encounters from 7/1/2018 – 6/30/2019 for patients < 5 years old presenting for a problem-focused visit. Complete note text and limited structured data were extracted for 12 randomly selected weekdays (one from each month during the study period). An additional weekday was randomly selected for validation. The primary outcome was correctly identifying encounters where AOM was present. Human review was considered the “gold standard” and was compared to ICD codes, a natural language processing (NLP) model, and a recursive partitioning (RP) model.

Results: A total of 2,724 encounters were included in the training cohort and 793 in the validation cohort. ICD codes and NLP had good performance overall, with sensitivity of 91.2% and 93.1% respectively in the training cohort. However, NLP had a significant drop-off in performance in the validation cohort (sensitivity: 83.4%). The RP model had the highest sensitivity (97.2% training cohort; 94.1% validation cohort) of the 3 methods.

Figure 1. Details of encounters included in the training and validation cohorts.

Table 1. Performance of ICD coding, a natural language processing (NLP) model, and a recursive partitioning (RP) model for identifying cases of acute otitis media (AOM).

Conclusion: Natural language processing of outpatient pediatric visit documentation can be used successfully to create models that accurately identify cases of AOM based on clinical documentation. Combining NLP and structured data can improve automated case detection, leading to more accurate assessment of antibiotic prescribing practices. These techniques may be valuable in optimizing outpatient antimicrobial stewardship efforts.

Disclosures: All Authors: No reported disclosures.
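The sensitivity figures above compare each automated method against chart-review gold standard. A minimal sketch of that comparison (with hypothetical encounter IDs, not the study's data) might look like:

```python
def sensitivity(gold, predicted):
    """Sensitivity (recall) of a case-detection method against a chart-review
    gold standard. Both arguments are sets of encounter IDs flagged as AOM."""
    true_pos = len(gold & predicted)
    false_neg = len(gold - predicted)
    return true_pos / (true_pos + false_neg)

# hypothetical encounter IDs, for illustration only
gold = {1, 2, 3, 4, 5}
icd_flagged = {1, 2, 3, 4, 9}  # one missed case, one false positive
print(f"ICD sensitivity: {sensitivity(gold, icd_flagged):.1%}")  # prints "ICD sensitivity: 80.0%"
```

Note that sensitivity alone ignores false positives; a full evaluation like Table 1 would also report specificity or positive predictive value.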


2021 ◽  
Vol 2 ◽  
Author(s):  
Denis Newman-Griffis ◽  
Jonathan Camacho Maldonado ◽  
Pei-Shu Ho ◽  
Maryanne Sacco ◽  
Rafael Jimenez Silva ◽  
...  

Background: Invaluable information on patient functioning and the complex interactions that define it is recorded in free text portions of the Electronic Health Record (EHR). Leveraging this information to improve clinical decision-making and conduct research requires natural language processing (NLP) technologies to identify and organize the information recorded in clinical documentation.

Methods: We used natural language processing methods to analyze information about patient functioning recorded in two collections of clinical documents pertaining to claims for federal disability benefits from the U.S. Social Security Administration (SSA). We grounded our analysis in the International Classification of Functioning, Disability, and Health (ICF), and used the Activities and Participation domain of the ICF to classify information about functioning in three key areas: mobility, self-care, and domestic life. After annotating functional status information in our datasets through expert clinical review, we trained machine learning-based NLP models to automatically assign ICF categories to mentions of functional activity.

Results: We found that rich and diverse information on patient functioning was documented in the free text records. Annotation of 289 documents for Mobility information yielded 2,455 mentions of Mobility activities and 3,176 specific actions corresponding to 13 ICF-based categories. Annotation of 329 documents for Self-Care and Domestic Life information yielded 3,990 activity mentions and 4,665 specific actions corresponding to 16 ICF-based categories. NLP systems for automated ICF coding achieved over 80% macro-averaged F-measure on both datasets, indicating strong performance across all ICF categories used.

Conclusions: Natural language processing can help to navigate the tradeoff between flexible and expressive clinical documentation of functioning and standardizable data for comparability and learning. The ICF has practical limitations for classifying functional status information in clinical documentation but presents a valuable framework for organizing the information recorded in health records about patient functioning. This study advances the development of robust, ICF-based NLP technologies to analyze information on patient functioning and has significant implications for NLP-powered analysis of functional status information in disability benefits management, clinical care, and research.
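The headline metric here is macro-averaged F-measure, which averages per-category F1 scores with equal weight so that rare ICF categories count as much as common ones. A self-contained sketch of the computation (with invented example labels, not the study's data) is:

```python
from collections import defaultdict

def macro_f1(gold_labels, pred_labels):
    """Macro-averaged F1 over single-label classifications:
    per-category F1 scores averaged with equal weight per category."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for g, p in zip(gold_labels, pred_labels):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    scores = []
    for cat in set(gold_labels) | set(pred_labels):
        precision = tp[cat] / (tp[cat] + fp[cat]) if tp[cat] + fp[cat] else 0.0
        recall = tp[cat] / (tp[cat] + fn[cat]) if tp[cat] + fn[cat] else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# hypothetical ICF Activities and Participation codes (e.g. d450 Walking)
gold = ["d450", "d450", "d550", "d550", "d640"]
pred = ["d450", "d550", "d550", "d550", "d640"]
print(round(macro_f1(gold, pred), 3))  # prints 0.822
```

The equal weighting is why "over 80% macro-averaged F-measure" implies strong performance even on the less frequent ICF categories.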


2019 ◽  
Vol 2 (8) ◽  
pp. e1910399
Author(s):  
Meliha Skaljic ◽  
Ihsaan H. Patel ◽  
Amelia M. Pellegrini ◽  
Victor M. Castro ◽  
Roy H. Perlis ◽  
...  

2021 ◽  
Author(s):  
Denis R Newman-Griffis ◽  
Jonathan Camacho Maldonado ◽  
Pei-Shu Ho ◽  
Maryanne Sacco ◽  
Rafael Jimenez Silva ◽  
...  



Author(s):  
Maitri Patel ◽  
Hemant D. Vasava

In this rapidly moving and growing world of data, information, and knowledge, almost any kind of information can be found on the Internet. This is highly useful, including for the academic world, but plagiarism has also become widespread: the originality of work is degraded, and fraudulently using someone's original work without acknowledging it is increasingly common. Teachers and professors often cannot identify plagiarized material, so higher-education institutions now use various comparison tools. Here we propose matching a number of documents, such as student assignments, against each other to determine whether students copied one another's work, and likewise comparing an ideal answer sheet for a particular subject examination against students' test sheets. In both cases the idea is the same: compare documents and rank them on the basis of similarity. Many methods for identifying plagiarism already exist, so we can compare them and develop them further where needed.
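The abstract names no specific similarity measure. One standard baseline for ranking document pairs by similarity is TF-IDF vectors compared with cosine similarity; a minimal stdlib-only sketch (with hypothetical student submissions) is:

```python
import math
import re
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a small document collection."""
    tokenized = [re.findall(r"[a-z']+", d.lower()) for d in docs]
    df = Counter()                      # document frequency of each term
    for tokens in tokenized:
        df.update(set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors represented as dicts."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# hypothetical assignments, for illustration only
assignments = {
    "alice": "the experiment measures reaction time under stress",
    "bob": "the experiment measures reaction time under stress conditions",
    "carol": "we surveyed students about their study habits",
}
vecs = dict(zip(assignments, tfidf_vectors(list(assignments.values()))))
pairs = sorted(
    ((a, b, cosine(vecs[a], vecs[b])) for a in vecs for b in vecs if a < b),
    key=lambda x: -x[2],
)
for a, b, score in pairs:
    print(f"{a} vs {b}: {score:.2f}")
```

Pairs are listed most-similar first, so the suspicious near-duplicate surfaces at the top of the ranking; real plagiarism detectors add n-gram fingerprinting and paraphrase handling on top of this kind of baseline.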


2015 ◽  
Vol 23 (3) ◽  
pp. 695 ◽  
Author(s):  
Arnaldo Candido Junior ◽  
Célia Magalhães ◽  
Helena Caseli ◽  
Régis Zangirolami

<p>This article evaluates the application of two efficient automatic keyword-extraction methods used by the Corpus Linguistics and Natural Language Processing communities to generate keywords from literary texts: <em>WordSmith Tools</em> and <em>Latent Dirichlet Allocation</em> (LDA). The two tools chosen for this work have their own specificities and distinct extraction techniques, which led us to a performance-oriented analysis. Our goal was to understand how each method works and to evaluate its application to literary texts. To that end, we used human analysis by annotators with knowledge of the domain of the texts. The LDA method was used to extract keywords through its integration with the <em>Portal Min@s: Corpora de Fala e Escrita</em>, a general corpus-processing system designed for a range of Corpus Linguistics research. The results of the experiment confirm the effectiveness of both WordSmith Tools and LDA in extracting keywords from a literary corpus, and indicate that human analysis of the lists at a stage prior to the experiments is needed to complement the automatically generated list, cross-referencing the WordSmith Tools and LDA results. They also indicate that the human analyst's linguistic intuition about the lists generated separately by the two methods favored the WordSmith Tools keyword list.</p>
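WordSmith-style keyword lists rank words by keyness, i.e. how much more frequent a word is in the study corpus than in a reference corpus, typically scored with Dunning's log-likelihood statistic. The sketch below illustrates that statistic on toy corpora (it is not the article's exact setup, and the example texts are invented):

```python
import math
from collections import Counter

def keyness(study_tokens, reference_tokens, top_n=5):
    """Rank words in the study corpus by Dunning's log-likelihood keyness
    against a reference corpus (the statistic behind WordSmith-style
    keyword lists)."""
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    n_study, n_ref = len(study_tokens), len(reference_tokens)
    scores = {}
    for word, a in study.items():
        b = ref.get(word, 0)
        # expected frequencies under the null hypothesis of equal relative frequency
        e1 = n_study * (a + b) / (n_study + n_ref)
        e2 = n_ref * (a + b) / (n_study + n_ref)
        ll = 2 * (a * math.log(a / e1) + (b * math.log(b / e2) if b else 0.0))
        scores[word] = ll
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# toy study and reference corpora, for illustration only
study = "the whale surfaced and the whale dived".split()
reference = "the dog ran and the cat slept in the sun".split()
print(keyness(study, reference, top_n=3))
```

Words that are disproportionately frequent in the study text ("whale" here) score highest, while function words shared with the reference ("the", "and") fall to the bottom, which is what makes the resulting list a candidate set of keywords for human review.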

