BERT-Based Natural Language Processing of Drug Labeling Documents: A Case Study for Classifying Drug-Induced Liver Injury Risk

Frontiers in Artificial Intelligence ◽

10.3389/frai.2021.729834 ◽

2021 ◽

Vol 4 ◽

Author(s):

Yue Wu ◽

Zhichao Liu ◽

Leihong Wu ◽

Minjun Chen ◽

Weida Tong

Keyword(s):

United States ◽

Deep Learning ◽

Natural Language Processing ◽

Liver Injury ◽

Natural Language ◽

Language Processing ◽

The United States ◽

Drug Induced ◽

Drug Induced Liver Injury ◽

Drug Labeling

Background & Aims: The United States Food and Drug Administration (FDA) regulates a broad range of consumer products, which account for about 25% of the United States market. The FDA's regulatory activities often involve producing and reading large numbers of documents, which is time-consuming and labor-intensive. To support regulatory science at the FDA, we evaluated artificial intelligence (AI)-based natural language processing (NLP) of regulatory documents for text classification and compared deep learning-based models with a conventional keyword-based model. Methods: FDA drug labeling documents were used as a representative regulatory data source to classify drug-induced liver injury (DILI) risk by employing the state-of-the-art language model BERT. The resulting NLP-DILI classification model was statistically validated with both internal and external validation procedures and applied to labeling data from the European Medicines Agency (EMA) for cross-agency application. Results: The NLP-DILI model developed using FDA labeling documents and evaluated by cross-validation in this study showed remarkable performance in DILI classification, with a recall of 1 and a precision of 0.78. When cross-agency data were used to validate the model, the performance remained comparable, demonstrating that the model was portable across agencies. Results also suggested that the model was able to capture the semantic meanings of sentences in drug labeling. Conclusion: Deep learning-based NLP models performed well in DILI classification of drug labeling documents and learned the meanings of complex text in drug labeling. This proof-of-concept work demonstrated that using AI technologies to assist regulatory activities is a promising approach to modernize and advance regulatory science.
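As a reading aid for the reported metrics: a recall of 1 with a precision of 0.78 means that no positive label was missed, while roughly 22% of positive predictions were false alarms. The sketch below is not part of the paper; it derives two summary figures from those two numbers alone. The function names are illustrative, and the F1 score is a standard derived metric that the abstract itself does not report.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def false_positives_per_true_positive(precision: float) -> float:
    """Expected false positives for each true positive, given precision.

    precision = TP / (TP + FP), so FP / TP = (1 - precision) / precision.
    """
    if precision <= 0:
        raise ValueError("precision must be positive")
    return (1 - precision) / precision


# Values quoted in the abstract for the cross-validated NLP-DILI model.
reported_precision = 0.78
reported_recall = 1.0

f1 = f1_score(reported_precision, reported_recall)
fp_per_tp = false_positives_per_true_positive(reported_precision)

print(f"F1 (derived, not reported in the abstract): {f1:.3f}")
print(f"False positives per true positive: {fp_per_tp:.2f}")
```

Because recall is 1, the F1 score here is driven entirely by precision; the derived figures describe only the trade-off implied by the two reported numbers and say nothing about the size of the underlying data set.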


A Study of the Effects of the COVID-19 Pandemic on the Experience of Back Pain Reported on Twitter® in the United States: A Natural Language Processing Approach

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18094543 ◽

2021 ◽

Vol 18 (9) ◽

pp. 4543

Author(s):

Krzysztof Fiok ◽

Waldemar Karwowski ◽

Edgar Gutierrez ◽

Maham Saeidi ◽

Awad M. Aljuaid ◽

...

Keyword(s):

United States ◽

Natural Language Processing ◽

Back Pain ◽

Natural Language ◽

Language Processing ◽

The United States ◽

Daily Routine ◽

Body Movements ◽

Data Source ◽

Twitter Users

The COVID-19 pandemic has changed our lifestyles, habits, and daily routines. Some of the impacts of COVID-19 have already been widely reported; however, many effects of the pandemic are still to be discovered. The main objective of this study was to assess changes in the frequency of physical back pain complaints reported during the COVID-19 pandemic. In contrast to other published studies, we target the general population using Twitter as a data source. Specifically, we aim to investigate differences in the number of back pain complaints between the pre-pandemic period and the pandemic period. A total of 53,234 and 78,559 tweets were analyzed for November 2019 and November 2020, respectively. Because Twitter users do not always complain explicitly when they tweet about the experience of back pain, we designed an intelligent filter based on natural language processing (NLP) to automatically classify the examined tweets into a back pain complaint class and all other tweets. Analysis of the filtered tweets indicated an 84% increase in back pain complaints reported in November 2020 compared to November 2019. These results might indicate significant changes in lifestyle during the COVID-19 pandemic, including restrictions in daily body movements and reduced exposure to routine physical exercise.


Causes, Clinical Features, and Outcomes From a Prospective Study of Drug-Induced Liver Injury in the United States

Yearbook of Medicine ◽

10.1016/s0084-3873(09)79531-8 ◽

2009 ◽

Vol 2009 ◽

pp. 458-459

Author(s):

D.M. Harnois

Keyword(s):

United States ◽

Liver Injury ◽

Prospective Study ◽

Clinical Features ◽

The United States ◽

Drug Induced ◽

Drug Induced Liver Injury ◽

A Prospective Study


Systematic Evaluation of Research Progress on Natural Language Processing in Medicine Over the Past 20 Years: Bibliometric Study on PubMed

Journal of Medical Internet Research ◽

10.2196/16816 ◽

2020 ◽

Vol 22 (1) ◽

pp. e16816 ◽

Cited By ~ 4

Author(s):

Jing Wang ◽

Huan Deng ◽

Bangtao Liu ◽

Anbin Hu ◽

Jun Liang ◽

...

Keyword(s):

United States ◽

Systematic Review ◽

Natural Language Processing ◽

Natural Language ◽

Medical Research ◽

Language Processing ◽

The United States ◽

Columbia University ◽

Medical Field ◽

Number Of Publications

Background: Natural language processing (NLP) is an important traditional field in computer science, but its application in medical research has faced many challenges. With the extensive digitalization of medical information globally and the increasing importance of understanding and mining big data in the medical field, NLP is becoming more crucial. Objective: The goal of the research was to perform a systematic review on the use of NLP in medical research with the aim of understanding the global progress on NLP research outcomes, content, methods, and study groups involved. Methods: A systematic review was conducted using the PubMed database as a search platform. All published studies on the application of NLP in medicine (except biomedicine) during the 20 years between 1999 and 2018 were retrieved. The data obtained from these published studies were cleaned and structured. Excel (Microsoft Corp) and VOSviewer (Nees Jan van Eck and Ludo Waltman) were used to perform bibliometric analysis of publication trends, author orders, countries, institutions, collaboration relationships, research hot spots, diseases studied, and research methods. Results: A total of 3498 articles were obtained during initial screening, and 2336 articles were found to meet the study criteria after manual screening. The number of publications increased every year, with a significant growth after 2012 (number of publications ranged from 148 to a maximum of 302 annually). The United States has occupied the leading position since the inception of the field, with the largest number of articles published. The United States contributed to 63.01% (1472/2336) of all publications, followed by France (5.44%, 127/2336) and the United Kingdom (3.51%, 82/2336). The author with the largest number of articles published was Hongfang Liu (70), while Stéphane Meystre (17) and Hua Xu (33) published the largest number of articles as first and corresponding authors, respectively. Among the first authors’ affiliated institutions, Columbia University published the largest number of articles, accounting for 4.54% (106/2336) of the total. Specifically, approximately one-fifth (17.68%, 413/2336) of the articles involved research on specific diseases, and the subject areas primarily focused on mental illness (16.46%, 68/413), breast cancer (5.81%, 24/413), and pneumonia (4.12%, 17/413). Conclusions: NLP is in a period of robust development in the medical field, with an average of approximately 100 publications annually. Electronic medical records were the most used research materials, but social media such as Twitter have become important research materials since 2015. Cancer (24.94%, 103/413) was the most common subject area in NLP-assisted medical research on diseases, with breast cancers (23.30%, 24/103) and lung cancers (14.56%, 15/103) accounting for the highest proportions of studies. Columbia University and the talents trained therein were the most active and prolific research forces on NLP in the medical field.
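The abstract quotes each share as a percentage alongside its raw fraction, so the arithmetic can be re-checked directly. The following is a minimal sketch for doing so; the `share` helper and the `REPORTED` table are illustrative names, and the counts and percentages are copied from the abstract above rather than from the underlying data set.

```python
def share(part: int, total: int) -> float:
    """Percentage of `total` represented by `part`, rounded to two decimals."""
    return round(100 * part / total, 2)


# (label, numerator, denominator, percentage as printed in the abstract)
REPORTED = [
    ("United States publications", 1472, 2336, 63.01),
    ("France publications", 127, 2336, 5.44),
    ("United Kingdom publications", 82, 2336, 3.51),
    ("Columbia University (first-author affiliation)", 106, 2336, 4.54),
    ("Articles on specific diseases", 413, 2336, 17.68),
    ("Mental illness", 68, 413, 16.46),
    ("Breast cancer (of disease articles)", 24, 413, 5.81),
    ("Pneumonia", 17, 413, 4.12),
    ("Cancer", 103, 413, 24.94),
    ("Breast cancer (of cancer articles)", 24, 103, 23.30),
    ("Lung cancer (of cancer articles)", 15, 103, 14.56),
]

# Collect any entry whose recomputed share differs from the printed one.
mismatches = [
    (label, share(part, total), reported)
    for label, part, total, reported in REPORTED
    if abs(share(part, total) - reported) > 0.005
]

for label, part, total, reported in REPORTED:
    print(f"{label}: {share(part, total):.2f}% (reported {reported:.2f}%)")
print("mismatches:", mismatches)
```

Every recomputed share agrees with the printed percentage to two decimals, so the list of mismatches is empty. Note that breast cancer appears twice with different denominators: once as a share of the 413 disease-specific articles and once as a share of the 103 cancer articles.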


Drug-Induced Liver Injury Network Causality Assessment: Criteria and Experience in the United States

International Journal of Molecular Sciences ◽

10.3390/ijms17020201 ◽

2016 ◽

Vol 17 (2) ◽

pp. 201 ◽

Cited By ~ 38

Author(s):

Paul Hayashi

Keyword(s):

United States ◽

Liver Injury ◽

Causality Assessment ◽

The United States ◽

Assessment Criteria ◽

Drug Induced ◽

Drug Induced Liver Injury


Etiology of New-Onset Jaundice: How Often Is It Caused by Idiosyncratic Drug-Induced Liver Injury in The United States?

The American Journal of Gastroenterology ◽

10.1111/j.1572-0241.2006.01019.x ◽

2007 ◽

Vol 102 (3) ◽

pp. 558-562 ◽

Cited By ~ 72

Author(s):

Raj Vuppalanchi ◽

Suthat Liangpunsakul ◽

Naga Chalasani

Keyword(s):

United States ◽

Liver Injury ◽

The United States ◽

Drug Induced ◽

Drug Induced Liver Injury ◽

New Onset


Causes, Clinical Features, and Outcomes From a Prospective Study of Drug-Induced Liver Injury in the United States

Yearbook of Gastroenterology ◽

10.1016/s0739-5930(09)79290-6 ◽

2009 ◽

Vol 2009 ◽

pp. 216-217

Author(s):

D. Harnois

Keyword(s):

United States ◽

Liver Injury ◽

Prospective Study ◽

Clinical Features ◽

The United States ◽

Drug Induced ◽

Drug Induced Liver Injury ◽

A Prospective Study


Drug-induced liver injury in the United States: A review of multi-ingredient supplements

Clinical Liver Disease ◽

10.1002/cld.535 ◽

2016 ◽

Vol 7 (3) ◽

pp. 60-63

Author(s):

Elizabeth Zheng ◽

Victor Navarro

Keyword(s):

United States ◽

Liver Injury ◽

The United States ◽

Drug Induced ◽

Drug Induced Liver Injury


Etiology of New-Onset Jaundice: How Often Is It Caused by Idiosyncratic Drug-Induced Liver Injury in The United States?

Yearbook of Medicine ◽

10.1016/s0084-3873(08)79055-2 ◽

2008 ◽

Vol 2008 ◽

pp. 487-488

Author(s):

J.S. Barkin

Keyword(s):

United States ◽

Liver Injury ◽

The United States ◽

Drug Induced ◽

Drug Induced Liver Injury ◽

New Onset


Causes, Clinical Features, and Outcomes From a Prospective Study of Drug-Induced Liver Injury in the United States

Gastroenterology ◽

10.1053/j.gastro.2008.09.011 ◽

2008 ◽

Vol 135 (6) ◽

pp. 1924-1934.e4 ◽

Cited By ~ 518

Author(s):

Naga Chalasani ◽

Robert J. Fontana ◽

Herbert L. Bonkovsky ◽

Paul B. Watkins ◽

Timothy Davern ◽

...

Keyword(s):

United States ◽

Liver Injury ◽

Prospective Study ◽

Clinical Features ◽

The United States ◽

Drug Induced ◽

Drug Induced Liver Injury ◽

A Prospective Study


Systematic Evaluation of Research Progress on Natural Language Processing in Medicine Over the Past 20 Years: Bibliometric Study on PubMed (Preprint)

10.2196/preprints.16816 ◽

2019 ◽

Author(s):

Jing Wang ◽

Huan Deng ◽

Bangtao Liu ◽

Anbin Hu ◽

Jun Liang ◽

...

Keyword(s):

United States ◽

Systematic Review ◽

Natural Language Processing ◽

Natural Language ◽

Medical Research ◽

Language Processing ◽

The United States ◽

Columbia University ◽

Medical Field ◽

Number Of Publications

BACKGROUND: Natural language processing (NLP) is an important traditional field in computer science, but its application in medical research has faced many challenges. With the extensive digitalization of medical information globally and the increasing importance of understanding and mining big data in the medical field, NLP is becoming more crucial. OBJECTIVE: The goal of the research was to perform a systematic review on the use of NLP in medical research with the aim of understanding the global progress on NLP research outcomes, content, methods, and study groups involved. METHODS: A systematic review was conducted using the PubMed database as a search platform. All published studies on the application of NLP in medicine (except biomedicine) during the 20 years between 1999 and 2018 were retrieved. The data obtained from these published studies were cleaned and structured. Excel (Microsoft Corp) and VOSviewer (Nees Jan van Eck and Ludo Waltman) were used to perform bibliometric analysis of publication trends, author orders, countries, institutions, collaboration relationships, research hot spots, diseases studied, and research methods. RESULTS: A total of 3498 articles were obtained during initial screening, and 2336 articles were found to meet the study criteria after manual screening. The number of publications increased every year, with a significant growth after 2012 (number of publications ranged from 148 to a maximum of 302 annually). The United States has occupied the leading position since the inception of the field, with the largest number of articles published. The United States contributed to 63.01% (1472/2336) of all publications, followed by France (5.44%, 127/2336) and the United Kingdom (3.51%, 82/2336). The author with the largest number of articles published was Hongfang Liu (70), while Stéphane Meystre (17) and Hua Xu (33) published the largest number of articles as first and corresponding authors, respectively. Among the first authors’ affiliated institutions, Columbia University published the largest number of articles, accounting for 4.54% (106/2336) of the total. Specifically, approximately one-fifth (17.68%, 413/2336) of the articles involved research on specific diseases, and the subject areas primarily focused on mental illness (16.46%, 68/413), breast cancer (5.81%, 24/413), and pneumonia (4.12%, 17/413). CONCLUSIONS: NLP is in a period of robust development in the medical field, with an average of approximately 100 publications annually. Electronic medical records were the most used research materials, but social media such as Twitter have become important research materials since 2015. Cancer (24.94%, 103/413) was the most common subject area in NLP-assisted medical research on diseases, with breast cancers (23.30%, 24/103) and lung cancers (14.56%, 15/103) accounting for the highest proportions of studies. Columbia University and the talents trained therein were the most active and prolific research forces on NLP in the medical field.
