Forty-two Million Ways to Describe Pain: Topic Modeling of 200,000 PubMed Pain-Related Abstracts Using Natural Language Processing and Deep Learning–Based Text Generation

Abstract. Objective: Recent efforts to update the definitions and taxonomic structure of concepts related to pain have revealed opportunities to better quantify the topics of existing pain research. Methods: Here, we apply basic natural language processing (NLP) analyses to a corpus of >200,000 abstracts published on PubMed under the medical subject heading (MeSH) of “pain” to quantify the topics, content, and themes of pain-related research dating back to the 1940s. Results: The most common stemmed terms included “pain” (601,122 occurrences), “patient” (508,064 occurrences), and “studi-” (208,839 occurrences). In contrast, terms with the highest term frequency–inverse document frequency included “tmd” (6.21), “qol” (6.01), and “endometriosis” (5.94). Using the vector embedding of terms obtained with the “word2vec” technique, the terms most similar to “pain” included “discomfort,” “symptom,” and “pain-related.” For the term “acute,” the most similar terms in the word2vec vector space included “nonspecific,” “vaso-occlusive,” and “subacute”; for the term “chronic,” the most similar terms included “persistent,” “longstanding,” and “long-standing.” Topic modeling via latent Dirichlet allocation identified peak coherence (0.49) at 40 topics. Network analysis of these topic models identified three topics that were outliers from the core cluster. Two of these pertained to women’s health and obstetrics and were closely connected to one another, yet distant from the third outlier, which pertained to age. A deep learning–based gated recurrent unit abstract generation model successfully synthesized several unique abstracts with varying levels of believability; at lower temperatures, the generated text paid special attention to, and showed some confusion about, the role of placebo in randomized controlled trials. Conclusions: Quantitative NLP models of published abstracts pertaining to pain may point to trends and gaps within pain research communities.
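
The term frequency–inverse document frequency ranking mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which weighting variant the authors used, so the formula below (relative term frequency multiplied by the natural log of N/df) is an assumption, as are the function name `tf_idf` and the toy corpus.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed variant: (count / document length) * ln(N / document frequency).
    Other smoothing and normalization schemes are common and give other values.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical three-document corpus, already tokenized.
toy_corpus = [
    "chronic pain patient".split(),
    "acute pain patient".split(),
    "pain tmd study".split(),
]
weights = tf_idf(toy_corpus)
```

In this toy corpus, "pain" appears in every document and therefore scores zero, while a term confined to one document, such as "tmd", scores highest. This mirrors the abstract's contrast between the most frequent terms and the terms with the highest tf-idf.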

Arabic Poem Generation Incorporating Deep Learning and Phonetic CNNsubword Embedding Models

International Journal of Robotic Computing ◽

10.35708/tai1868-126246 ◽

2019 ◽

pp. 64-91

Author(s):

Sameerah Talafha ◽

Banafsheh Rekabdar

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Model Performance ◽

Arabic Language ◽

Generation Model ◽

Two Stage ◽

Human Evaluation ◽

Effective Contribution

Arabic poetry generation is a very challenging task, since the linguistic structure of the Arabic language is considered a severe challenge by many researchers and developers in the Natural Language Processing (NLP) field. In this paper, we propose a poetry generation model with extended phonetic and semantic embeddings (Phonetic CNNsubword embeddings). We show that Phonetic CNNsubword embeddings contribute effectively to overall model performance compared with FastTextsubword embeddings. Our poetry generation model uses a two-stage approach: (1) generating the first verse, which explicitly incorporates the theme-related phrase; and (2) generating the remaining verses with the proposed Hierarchy-Attention Sequence-to-Sequence model (HAS2S), which adequately captures word, phrase, and verse information between contexts. A comprehensive human evaluation confirms that the poems generated by our model outperform those of the base models on criteria such as Meaning, Coherence, Fluency, and Poeticness. Extensive quantitative experiments using Bilingual Evaluation Understudy (BLEU) scores also demonstrate significant improvements over strong baselines.

Daily estimates of individual discharge likelihood with deep learning natural language processing in general medicine: a prospective and external validation study

Internal and Emergency Medicine ◽

10.1007/s11739-021-02816-7 ◽

2021 ◽

Author(s):

Stephen Bacchi ◽

Toby Gilbert ◽

Samuel Gluck ◽

Joy Cheng ◽

Yiran Tan ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Validation Study ◽

External Validation ◽

General Medicine ◽

External Validation Study

Deep Learning on Graphs for Natural Language Processing

Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval ◽

10.1145/3404835.3462809 ◽

2021 ◽

Author(s):

Lingfei Wu ◽

Yu Chen ◽

Heng Ji ◽

Bang Liu

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing

Deep Learning Techniques on Text Classification Using Natural Language Processing (NLP) In Social Healthcare Network: A Comprehensive Survey

2021 3rd International Conference on Signal Processing and Communication (ICSPC) ◽

10.1109/icspc51351.2021.9451752 ◽

2021 ◽

Author(s):

PM. Lavanya ◽

E. Sasikala

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Classification ◽

Healthcare Network ◽

Learning Techniques ◽

Comprehensive Survey

A natural language processing approach based on embedding deep learning from heterogeneous compounds for quantitative structure–activity relationship modeling

Chemical Biology & Drug Design ◽

10.1111/cbdd.13742 ◽

2020 ◽

Vol 96 (3) ◽

pp. 961-972

Author(s):

Khalid Bouhedjar ◽

Abdelbasset Boukelia ◽

Abdelmalek Khorief Nacereddine ◽

Anouar Boucheham ◽

Amine Belaidi ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Quantitative Structure Activity Relationship ◽

Structure Activity Relationship ◽

Activity Relationship ◽

Quantitative Structure ◽

Structure Activity ◽

Processing Approach

Speech Master: Natural Language Processing and Deep Learning Approach for Automated Speech Evaluation

10.1109/iemcon53756.2021.9623163 ◽

2021 ◽

Author(s):

K.G.C.M Kooragama ◽

L.R.W.D. Jayashanka ◽

J.A. Munasinghe ◽

K.W. Jayawardana ◽

Muditha Tissera ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Approach ◽

Speech Evaluation

Deep Learning Approaches for Spoken and Natural Language Processing

10.1007/978-3-030-79778-2 ◽

2021 ◽

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Approaches

Use of Natural Language Processing and Deep Learning towards Guiding Healthy Cholesterol Free Life

10.1109/icac54203.2021.9671230 ◽

2021 ◽

Author(s):

Dilith Sasanka ◽

H. K. N Malshani ◽

Uchitha I. Wickramaratne ◽

Yashmitha Kavindi ◽

Muditha Tissera ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing

An Evaluation of Patient Safety Event Report Categories Using Unsupervised Topic Modeling

Methods of Information in Medicine ◽

10.3414/me15-01-0010 ◽

2015 ◽

Vol 54 (04) ◽

pp. 338-345 ◽

Cited By ~ 10

Author(s):

A. Fong ◽

R. Ratwani

Keyword(s):

Patient Safety ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Topic Modeling ◽

Free Text ◽

Event Data ◽

Event Type ◽

Modeling Approach ◽

Safety Event

Summary. Objective: Patient safety event data repositories have the potential to dramatically improve safety if analyzed and leveraged appropriately. These safety event reports often consist of both structured data, such as general event type categories, and unstructured data, such as free text descriptions of the event. Analyzing these data, particularly the rich free text narratives, can be challenging, especially with tens of thousands of reports. To overcome the resource intensive manual review process of the free text descriptions, we demonstrate the effectiveness of using an unsupervised natural language processing approach. Methods: An unsupervised natural language processing technique, called topic modeling, was applied to a large repository of patient safety event data to identify topics, or themes, from the free text descriptions of the data. Entropy measures were used to evaluate and compare these topics to the general event type categories that were originally assigned by the event reporter. Results: Measures of entropy demonstrated that some topics generated from the unsupervised modeling approach aligned with the clinical general event type categories that were originally selected by the individual entering the report. Importantly, several new latent topics emerged that were not originally identified. The new topics provide additional insights into the patient safety event data that would not otherwise easily be detected. Conclusion: The topic modeling approach provides a method to identify topics or themes that may not be immediately apparent and has the potential to allow for automatic reclassification of events that are ambiguously classified by the event reporter.
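
One way to read the entropy comparison described in the Methods is sketched below. The abstract does not specify which entropy measure was used, so Shannon entropy over the reporter-assigned categories within each topic is an assumption, and the function name and the toy reports are hypothetical. Under this reading, a topic with entropy near zero aligns with a single event type category, while a high-entropy topic cuts across categories and may represent a new latent theme.

```python
import math
from collections import Counter, defaultdict


def category_entropy_by_topic(pairs):
    """Map each topic to the Shannon entropy (in bits) of its category labels.

    pairs: iterable of (topic, reporter_assigned_category) tuples,
    one per report.
    """
    by_topic = defaultdict(Counter)
    for topic, category in pairs:
        by_topic[topic][category] += 1
    result = {}
    for topic, counts in by_topic.items():
        total = sum(counts.values())
        result[topic] = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
    return result


# Hypothetical reports: (dominant topic, category chosen by the reporter).
toy_reports = [
    ("topic_a", "fall"), ("topic_a", "fall"), ("topic_a", "fall"),
    ("topic_b", "medication"), ("topic_b", "fall"),
]
entropies = category_entropy_by_topic(toy_reports)
```

Here `topic_a` maps entirely to one category and has zero entropy, whereas `topic_b` is split evenly between two categories and has an entropy of one bit.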

Automatic ICD-10 Coding and Training System: Deep Neural Network Based on Supervised Learning

JMIR Medical Informatics ◽

10.2196/23230 ◽

2021 ◽

Vol 9 (8) ◽

pp. e23230

Author(s):

Pei-Fu Chen ◽

Ssu-Ming Wang ◽

Wei-Chih Liao ◽

Lu-Cheng Kuo ◽

Kuan-Chih Chen ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Deep Neural Network ◽

University Hospital ◽

Classification Model ◽

Icd 10 ◽

And Training

Background: The International Classification of Diseases (ICD) code is widely used as the reference for medical systems and billing purposes. However, classifying diseases into ICD codes still mainly relies on humans reading a large amount of written material as the basis for coding. Coding is both laborious and time-consuming. Since the conversion from ICD-9 to ICD-10, the coding task has become much more complicated, and deep learning– and natural language processing–related approaches have been studied to assist disease coders. Objective: This paper aims to construct a deep learning model for ICD-10 coding that automatically determines the corresponding diagnosis and procedure codes based solely on free-text medical notes, to improve accuracy and reduce human effort. Methods: We used diagnosis records of the National Taiwan University Hospital as resources and applied natural language processing techniques, including global vectors (GloVe), word to vectors (word2vec), embeddings from language models (ELMo), bidirectional encoder representations from transformers (BERT), and single head attention recurrent neural network, to the deep neural network architecture to implement ICD-10 auto-coding. In addition, we introduced the attention mechanism into the classification model to extract keywords from diagnoses and visualize the coding reference for training newcomers to ICD-10 coding. Sixty discharge notes were randomly selected to examine the change in the F1-score and the coding time of coders before and after using our model. Results: In experiments on the medical data set of National Taiwan University Hospital, our prediction results revealed F1-scores of 0.715 and 0.618 for the ICD-10 Clinical Modification code and Procedure Coding System code, respectively, with a bidirectional encoder representations from transformers embedding approach in the Gated Recurrent Unit classification model. The trained models were deployed in an ICD-10 web service for coding and for training ICD-10 users. With this service, the coders' median F1-score increased significantly from 0.832 to 0.922 (P<.05), but coding time was not reduced. Conclusions: The proposed model significantly improved the F1-score but did not decrease the time consumed in coding by disease coders.
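
For readers unfamiliar with the metric, the F1-score reported above is the harmonic mean of precision and recall. The sketch below computes it for the set of codes assigned to a single note. The abstract does not say how scores were aggregated across notes or codes, so this per-note set comparison is an assumption, and the function name and example codes are illustrative only.

```python
def f1_score(predicted, actual):
    """F1 for one note: harmonic mean of precision and recall over code sets."""
    predicted, actual = set(predicted), set(actual)
    true_pos = len(predicted & actual)
    if true_pos == 0:
        # No overlap (or an empty set): define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(actual)
    return 2 * precision * recall / (precision + recall)


# Illustrative case: three codes predicted, two of them correct, none missed.
score = f1_score({"I10", "E11.9", "J18.9"}, {"I10", "E11.9"})
```

In this example precision is 2/3 and recall is 1, giving an F1 of 0.8. The one extra predicted code lowers the score even though no correct code was missed.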
