Natural Language Processing for Rapid Response to Emergent Diseases: Case Study of Calcium Channel Blockers and Hypertension in the COVID-19 Pandemic

Background A novel disease poses special challenges for informatics solutions. Biomedical informatics relies for the most part on structured data, which require a preexisting data or knowledge model; however, novel diseases do not have preexisting knowledge models. In an emergent epidemic, language processing can enable rapid conversion of unstructured text to a novel knowledge model. However, although this idea has often been suggested, no opportunity has arisen to actually test it in real time. The current coronavirus disease (COVID-19) pandemic presents such an opportunity. Objective The aim of this study was to evaluate the added value of information from clinical text in response to emergent diseases using natural language processing (NLP). Methods We explored the effects of long-term treatment by calcium channel blockers on the outcomes of COVID-19 infection in patients with high blood pressure during in-patient hospital stays using two sources of information: data available strictly from structured electronic health records (EHRs) and data available through structured EHRs and text mining. Results In this multicenter study involving 39 hospitals, text mining increased the statistical power sufficiently to change a negative result for an adjusted hazard ratio to a positive one. Compared to the baseline structured data, the number of patients available for inclusion in the study increased by 2.95 times, the amount of available information on medications increased by 7.2 times, and the amount of additional phenotypic information increased by 11.9 times. Conclusions In our study, use of calcium channel blockers was associated with decreased in-hospital mortality in patients with COVID-19 infection. This finding was obtained by quickly adapting an NLP pipeline to the domain of the novel disease; the adapted pipeline still performed sufficiently to extract useful information. When that information was used to supplement existing structured data, the sample size could be increased sufficiently to see treatment effects that were not previously statistically detectable.

Download Full-text

Natural Language Processing for Rapid Response to Emergent Diseases: Case Study of Calcium Channel Blockers and Hypertension in the COVID-19 Pandemic (Preprint)

10.2196/preprints.20773 ◽

2020 ◽

Author(s):

Antoine Neuraz ◽

Ivan Lerner ◽

William Digan ◽

Nicolas Paris ◽

Rosy Tsopra ◽

...

Keyword(s):

Natural Language Processing ◽

Text Mining ◽

Natural Language ◽

Calcium Channel ◽

Language Processing ◽

Calcium Channel Blockers ◽

Structured Data ◽

Channel Blockers ◽

Knowledge Model ◽

Long Term Treatment

BACKGROUND A novel disease poses special challenges for informatics solutions. Biomedical informatics relies for the most part on structured data, which require a preexisting data or knowledge model; however, novel diseases do not have preexisting knowledge models. In an emergent epidemic, language processing can enable rapid conversion of unstructured text to a novel knowledge model. However, although this idea has often been suggested, no opportunity has arisen to actually test it in real time. The current coronavirus disease (COVID-19) pandemic presents such an opportunity. OBJECTIVE The aim of this study was to evaluate the added value of information from clinical text in response to emergent diseases using natural language processing (NLP). METHODS We explored the effects of long-term treatment by calcium channel blockers on the outcomes of COVID-19 infection in patients with high blood pressure during in-patient hospital stays using two sources of information: data available strictly from structured electronic health records (EHRs) and data available through structured EHRs and text mining. RESULTS In this multicenter study involving 39 hospitals, text mining increased the statistical power sufficiently to change a negative result for an adjusted hazard ratio to a positive one. Compared to the baseline structured data, the number of patients available for inclusion in the study increased by 2.95 times, the amount of available information on medications increased by 7.2 times, and the amount of additional phenotypic information increased by 11.9 times. CONCLUSIONS In our study, use of calcium channel blockers was associated with decreased in-hospital mortality in patients with COVID-19 infection. This finding was obtained by quickly adapting an NLP pipeline to the domain of the novel disease; the adapted pipeline still performed sufficiently to extract useful information. When that information was used to supplement existing structured data, the sample size could be increased sufficiently to see treatment effects that were not previously statistically detectable.

Download Full-text

Does higher education properly prepare graduates for the growing artificial intelligence market? Gaps identification using text mining

Human Systems Management ◽

10.3233/hsm-211179 ◽

2021 ◽

pp. 1-13

Author(s):

Lamiae Benhayoun ◽

Daniel Lang

Keyword(s):

Artificial Intelligence ◽

Natural Language Processing ◽

Text Mining ◽

Natural Language ◽

Language Processing ◽

Academic Training ◽

Market Requirements ◽

Job Advertisements ◽

The Individual

BACKGROUND: The renewed advent of Artificial Intelligence (AI) is inducing profound changes in the classic categories of technology professions and is creating the need for new specific skills. OBJECTIVE: Identify the gaps in terms of skills between academic training on AI in French engineering and Business Schools, and the requirements of the labour market. METHOD: Extraction of AI training contents from the schools’ websites and scraping of a job advertisements’ website. Then, analysis based on a text mining approach with a Python code for Natural Language Processing. RESULTS: Categorization of occupations related to AI. Characterization of three classes of skills for the AI market: Technical, Soft and Interdisciplinary. Skills’ gaps concern some professional certifications and the mastery of specific tools, research abilities, and awareness of ethical and regulatory dimensions of AI. CONCLUSIONS: A deep analysis using algorithms for Natural Language Processing. Results that provide a better understanding of the AI capability components at the individual and the organizational levels. A study that can help shape educational programs to respond to the AI market requirements.

Download Full-text

Identifying Causality and Contributory Factors of Pipeline incidents by Employing Natural Language Processing and Text Mining Techniques

Process Safety and Environmental Protection ◽

10.1016/j.psep.2021.05.036 ◽

2021 ◽

Author(s):

Guanyang Liu ◽

Mason Boyd ◽

Mengxi Yu ◽

S. Zohra Halim ◽

Noor Quddus

Keyword(s):

Natural Language Processing ◽

Text Mining ◽

Natural Language ◽

Language Processing ◽

Contributory Factors

Download Full-text

Application of Natural Language Processing and Text Mining to Identify Patterns in Construction-Defect Litigation Cases

Journal of Legal Affairs and Dispute Resolution in Engineering and Construction ◽

10.1061/(asce)la.1943-4170.0000308 ◽

2019 ◽

Vol 11 (4) ◽

pp. 04519024 ◽

Cited By ~ 4

Author(s):

Yashovardhan Jallan ◽

Elizabeth Brogan ◽

Baabak Ashuri ◽

Caroline M. Clevenger

Keyword(s):

Natural Language Processing ◽

Text Mining ◽

Natural Language ◽

Language Processing

Download Full-text

Implementation of a Cohort Retrieval System for Clinical Data Repositories Using the Observational Medical Outcomes Partnership Common Data Model: Proof-of-Concept System Validation (Preprint)

10.2196/preprints.17376 ◽

2019 ◽

Cited By ~ 1

Author(s):

Sijia Liu ◽

Yanshan Wang ◽

Andrew Wen ◽

Liwei Wang ◽

Na Hong ◽

...

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Data Model ◽

Structured Data ◽

Common Data Model ◽

Concept System ◽

Unstructured Text ◽

Electronic Health

BACKGROUND Widespread adoption of electronic health records has enabled the secondary use of electronic health record data for clinical research and health care delivery. Natural language processing techniques have shown promise in their capability to extract the information embedded in unstructured clinical data, and information retrieval techniques provide flexible and scalable solutions that can augment natural language processing systems for retrieving and ranking relevant records. OBJECTIVE In this paper, we present the implementation of a cohort retrieval system that can execute textual cohort selection queries on both structured data and unstructured text—Cohort Retrieval Enhanced by Analysis of Text from Electronic Health Records (CREATE). METHODS CREATE is a proof-of-concept system that leverages a combination of structured queries and information retrieval techniques on natural language processing results to improve cohort retrieval performance using the Observational Medical Outcomes Partnership Common Data Model to enhance model portability. The natural language processing component was used to extract common data model concepts from textual queries. We designed a hierarchical index to support the common data model concept search utilizing information retrieval techniques and frameworks. RESULTS Our case study on 5 cohort identification queries, evaluated using the precision at 5 information retrieval metric at both the patient-level and document-level, demonstrates that CREATE achieves a mean precision at 5 of 0.90, which outperforms systems using only structured data or only unstructured text with mean precision at 5 values of 0.54 and 0.74, respectively. CONCLUSIONS The implementation and evaluation of Mayo Clinic Biobank data demonstrated that CREATE outperforms cohort retrieval systems that only use one of either structured data or unstructured text in complex textual cohort queries.

Download Full-text