Automatic detection of actionable radiology reports using bidirectional encoder representations from transformers

Abstract Background It is essential for radiologists to communicate actionable findings to the referring clinicians reliably. Natural language processing (NLP) has been shown to help identify free-text radiology reports including actionable findings. However, the application of recent deep learning techniques to radiology reports, which can improve the detection performance, has not been thoroughly examined. Moreover, free-text that clinicians input in the ordering form (order information) has seldom been used to identify actionable reports. This study aims to evaluate the benefits of two new approaches: (1) bidirectional encoder representations from transformers (BERT), a recent deep learning architecture in NLP, and (2) using order information in addition to radiology reports. Methods We performed a binary classification to distinguish actionable reports (i.e., radiology reports tagged as actionable in actual radiological practice) from non-actionable ones (those without an actionable tag). 90,923 Japanese radiology reports in our hospital were used, of which 788 (0.87%) were actionable. We evaluated four methods, statistical machine learning with logistic regression (LR) and with gradient boosting decision tree (GBDT), and deep learning with a bidirectional long short-term memory (LSTM) model and a publicly available Japanese BERT model. Each method was used with two different inputs, radiology reports alone and pairs of order information and radiology reports. Thus, eight experiments were conducted to examine the performance. Results Without order information, BERT achieved the highest area under the precision-recall curve (AUPRC) of 0.5138, which showed a statistically significant improvement over LR, GBDT, and LSTM, and the highest area under the receiver operating characteristic curve (AUROC) of 0.9516. Simply coupling the order information with the radiology reports slightly increased the AUPRC of BERT but did not lead to a statistically significant improvement. This may be due to the complexity of clinical decisions made by radiologists. Conclusions BERT was assumed to be useful to detect actionable reports. More sophisticated methods are required to use order information effectively.

Download Full-text

A Highly Generalizable Natural Language Processing Algorithm for the Diagnosis of Pulmonary Embolism from Radiology Reports

10.1101/2020.10.13.20211961 ◽

2020 ◽

Author(s):

Jacob Johnson ◽

Grace Qiu ◽

Christine Lamoureux ◽

Jennifer Ngo ◽

Lawrence Ngo

Keyword(s):

Pulmonary Embolism ◽

Deep Learning ◽

Natural Language Processing ◽

Sample Size ◽

Language Processing ◽

High Accuracy ◽

Free Text ◽

Radiology Reports ◽

Natural Language Processing Algorithm

AbstractThough sophisticated algorithms have been developed for the classification of free-text radiology reports for pulmonary embolism (PE), their overall generalizability remains unvalidated given limitations in sample size and data homogeneity. We developed and validated a highly generalizable deep-learning based NLP algorithm for this purpose with data sourced from over 2,000 hospital sites and 500 radiologists. The algorithm achieved an AUCROC of 0.995 on chest angiography studies and 0.994 on non-angiography studies for the presence or absence of PE. The high accuracy achieved on this large and heterogeneous dataset allows for the possibility of application in large multi-center radiology practices as well as for deployment at novel sites without significant degradation in performance.

Download Full-text

Classification of Twitter Vaping Discourse Using BERTweet: Comparative Deep Learning Study (Preprint)

10.2196/preprints.33678 ◽

2021 ◽

Author(s):

Alycia Noel Carey ◽

William Baker ◽

Jason B. Colditz ◽

Huy Mai ◽

Shyam Visweswaran ◽

...

Keyword(s):

Deep Learning ◽

Language Processing ◽

Short Term Memory ◽

Characteristic Curve ◽

Data Sets ◽

Learning Approaches ◽

Sensitive Data ◽

Twitter Data ◽

Traditional Natural

BACKGROUND Twitter provides a valuable platform for the surveillance and monitoring of public health topics; however, manually categorizing large quantities of Twitter data is labor intensive and presents barriers to identify major trends and sentiments. Additionally, while machine and deep learning approaches have been proposed with high accuracy, they require large, annotated data sets. Public pre-trained deep learning classification models, such as BERTweet, produce higher quality models while using smaller annotated training sets. OBJECTIVE This study aims to derive and evaluate a pre-trained deep learning model based on BERTweet that can identify tweets relevant to vaping, tweets (related to vaping) of commercial nature, and tweets with pro-vape sentiment. Additionally, the performance of the BERTweet classifier will be compared against a long short-term memory (LSTM) model to show the improvements a pre-trained model has over traditional deep learning approaches. METHODS Twitter data were collected from August – October 2019 using vaping related search terms. From this set, a random subsample of 2,401 English tweets was manually annotated for relevance (vaping related or not), commercial nature (commercial or not), and sentiment (positive, negative, neutral). Using the annotated data, three separate classifiers were built using BERTweet with the default parameters defined by the Simple Transformer API. Each model was trained for 20 iterations and evaluated with a random split of the annotate tweets, reserving 10% of tweets for evaluations. RESULTS The relevance, commercial, and sentiment classifiers achieved an area under the receiver operating characteristic curve (AUROC) of 94.5%, 99.3%, and 81.7%, respectively. Additionally, the weighted F1 scores of each were 97.6%, 99.0%, and 86.1%. We found that BERTweet outperformed the LSTM model in classification of all categories. CONCLUSIONS Large, open-source deep learning classifiers, such as BERTweet, can provide researchers the ability to reliably determine if tweets are relevant to vaping, include commercial content, and include positive, negative, or neutral content about vaping with a higher accuracy than traditional Natural Language Processing deep learning models. Such enhancement to the utilization of Twitter data can allow for faster exploration and dissemination of time-sensitive data than traditional methodologies (e.g., surveys, polling research).

Download Full-text

A Study on 3D Deep Learning-Based Automatic Diagnosis of Nasal Fractures

Sensors ◽

10.3390/s22020506 ◽

2022 ◽

Vol 22 (2) ◽

pp. 506

Author(s):

Yu-Jin Seol ◽

Young-Jae Kim ◽

Yoon-Sang Kim ◽

Young-Woo Cheon ◽

Kwang-Gi Kim

Keyword(s):

Deep Learning ◽

Binary Classification ◽

Characteristic Curve ◽

Nasal Bone ◽

Facial Trauma ◽

Future Research ◽

Automatic Diagnosis ◽

3 Dimensional ◽

Computed Tomography Images ◽

Nasal Fractures

This paper reported a study on the 3-dimensional deep-learning-based automatic diagnosis of nasal fractures. (1) Background: The nasal bone is the most protuberant feature of the face; therefore, it is highly vulnerable to facial trauma and its fractures are known as the most common facial fractures worldwide. In addition, its adhesion causes rapid deformation, so a clear diagnosis is needed early after fracture onset. (2) Methods: The collected computed tomography images were reconstructed to isotropic voxel data including the whole region of the nasal bone, which are represented in a fixed cubic volume. The configured 3-dimensional input data were then automatically classified by the deep learning of residual neural networks (3D-ResNet34 and ResNet50) with the spatial context information using a single network, whose performance was evaluated by 5-fold cross-validation. (3) Results: The classification of nasal fractures with simple 3D-ResNet34 and ResNet50 networks achieved areas under the receiver operating characteristic curve of 94.5% and 93.4% for binary classification, respectively, both indicating unprecedented high performance in the task. (4) Conclusions: In this paper, it is presented the possibility of automatic nasal bone fracture diagnosis using a 3-dimensional Resnet-based single classification network and it will improve the diagnostic environment with future research.

Download Full-text

Deep Learning-Based Sentiment Analysis of COVID-19 Vaccination Responses from Twitter Data

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/4321131 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Kazi Nabiul Alam ◽

Md Shakib Khan ◽

Abdur Rab Dhruba ◽

Mohammad Monirujjaman Khan ◽

Jehad F. Al-Amri ◽

...

Keyword(s):

Deep Learning ◽

Language Processing ◽

Performance Metrics ◽

Short Term Memory ◽

Confusion Matrix ◽

Short Term ◽

Learning Techniques ◽

The World ◽

Long Short Term Memory ◽

Severe Anxiety

The COVID-19 pandemic has had a devastating effect on many people, creating severe anxiety, fear, and complicated feelings or emotions. After the initiation of vaccinations against coronavirus, people’s feelings have become more diverse and complex. Our aim is to understand and unravel their sentiments in this research using deep learning techniques. Social media is currently the best way to express feelings and emotions, and with the help of Twitter, one can have a better idea of what is trending and going on in people’s minds. Our motivation for this research was to understand the diverse sentiments of people regarding the vaccination process. In this research, the timeline of the collected tweets was from December 21 to July21. The tweets contained information about the most common vaccines available recently from across the world. The sentiments of people regarding vaccines of all sorts were assessed using the natural language processing (NLP) tool, Valence Aware Dictionary for sEntiment Reasoner (VADER). Initializing the polarities of the obtained sentiments into three groups (positive, negative, and neutral) helped us visualize the overall scenario; our findings included 33.96% positive, 17.55% negative, and 48.49% neutral responses. In addition, we included our analysis of the timeline of the tweets in this research, as sentiments fluctuated over time. A recurrent neural network- (RNN-) oriented architecture, including long short-term memory (LSTM) and bidirectional LSTM (Bi-LSTM), was used to assess the performance of the predictive models, with LSTM achieving an accuracy of 90.59% and Bi-LSTM achieving 90.83%. Other performance metrics such as precision,, F1-score, and a confusion matrix were also used to validate our models and findings more effectively. This study improves understanding of the public’s opinion on COVID-19 vaccines and supports the aim of eradicating coronavirus from the world.

Download Full-text

Chinese Text Classification Model Based on Deep Learning

Future Internet ◽

10.3390/fi10110113 ◽

2018 ◽

Vol 10 (11) ◽

pp. 113 ◽

Cited By ~ 17

Author(s):

Yue Li ◽

Xutao Wang ◽

Pengjian Xu

Keyword(s):

Neural Network ◽

Deep Learning ◽

Language Processing ◽

Chinese Text ◽

Text Classification ◽

Short Term Memory ◽

Classification Model ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory

Text classification is of importance in natural language processing, as the massive text information containing huge amounts of value needs to be classified into different categories for further use. In order to better classify text, our paper tries to build a deep learning model which achieves better classification results in Chinese text than those of other researchers’ models. After comparing different methods, long short-term memory (LSTM) and convolutional neural network (CNN) methods were selected as deep learning methods to classify Chinese text. LSTM is a special kind of recurrent neural network (RNN), which is capable of processing serialized information through its recurrent structure. By contrast, CNN has shown its ability to extract features from visual imagery. Therefore, two layers of LSTM and one layer of CNN were integrated to our new model: the BLSTM-C model (BLSTM stands for bi-directional long short-term memory while C stands for CNN.) LSTM was responsible for obtaining a sequence output based on past and future contexts, which was then input to the convolutional layer for extracting features. In our experiments, the proposed BLSTM-C model was evaluated in several ways. In the results, the model exhibited remarkable performance in text classification, especially in Chinese texts.

Download Full-text

Expert guided natural language processing using one-class classification

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocv010 ◽

2015 ◽

Vol 22 (5) ◽

pp. 962-966 ◽

Cited By ~ 5

Author(s):

Erel Joffe ◽

Emily J Pettigrew ◽

Jorge R Herskovic ◽

Charles F Bearden ◽

Elmer V Bernstam

Keyword(s):

Breast Cancer ◽

Language Processing ◽

Text Processing ◽

Binary Classification ◽

Model Performance ◽

Imbalanced Data ◽

Superior Performance ◽

Support Vector ◽

Free Text ◽

One Class Classification

Abstract Introduction Automatically identifying specific phenotypes in free-text clinical notes is critically important for the reuse of clinical data. In this study, the authors combine expert-guided feature (text) selection with one-class classification for text processing. Objectives To compare the performance of one-class classification to traditional binary classification; to evaluate the utility of feature selection based on expert-selected salient text (snippets); and to determine the robustness of these models with respects to irrelevant surrounding text. Methods The authors trained one-class support vector machines (1C-SVMs) and two-class SVMs (2C-SVMs) to identify notes discussing breast cancer. Manually annotated visit summary notes (88 positive and 88 negative for breast cancer) were used to compare the performance of models trained on whole notes labeled as positive or negative to models trained on expert-selected text sections (snippets) relevant to breast cancer status. Model performance was evaluated using a 70:30 split for 20 iterations and on a realistic dataset of 10 000 records with a breast cancer prevalence of 1.4%. Results When tested on a balanced experimental dataset, 1C-SVMs trained on snippets had comparable results to 2C-SVMs trained on whole notes (F = 0.92 for both approaches). When evaluated on a realistic imbalanced dataset, 1C-SVMs had a considerably superior performance (F = 0.61 vs. F = 0.17 for the best performing model) attributable mainly to improved precision (p = .88 vs. p = .09 for the best performing model). Conclusions 1C-SVMs trained on expert-selected relevant text sections perform better than 2C-SVMs classifiers trained on either snippets or whole notes when applied to realistically imbalanced data with low prevalence of the positive class.

Download Full-text

COVID-19 ChatBot

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38757 ◽

2021 ◽

Vol 9 (11) ◽

pp. 44-49

Author(s):

Satish Tirumalapudi

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

And Control ◽

Prediction Problems

Abstract: Chat bots are software applications that help users to communicate with the machine and get the required result, this is where Natural Language Processing (NLP) comes into the picture. Natural language processing is based on deep learning that enables computers to acquire meaning from inputs given by the users. Natural language processing techniques can make possible the use of natural language to express ideas, thus drastically increasing accessibility. NLP engines rely on the elements of intent, utterance, entity, context, and session. Here in this project, we will be using Deep learning techniques which will be trained on the dataset which contains categories, patterns, and responses. Long Short-Term Memory (LSTM) is a Recurrent Neural Network that is capable of learning order dependence in sequence prediction problems. One of the most popular RNN approaches is LSTM to identify and control a dynamic system. We use an RNN to classify the category user’s message belongs to and then will give a response from the list of responses. Keywords: NLP – Natural Language Processing, LSTM – Long Short Term Memory, RNN – Recurrent Neural Networks.

Download Full-text

DICE: A Drug Indication Classification and Encyclopedia for AI-Based Indication Extraction

Frontiers in Artificial Intelligence ◽

10.3389/frai.2021.711467 ◽

2021 ◽

Vol 4 ◽

Author(s):

Arjun Bhatt ◽

Ruth Roberts ◽

Xi Chen ◽

Ting Li ◽

Skylar Connor ◽

...

Keyword(s):

Language Processing ◽

Clinical Decision Making ◽

Short Term Memory ◽

Drug Repositioning ◽

External Validation ◽

Clinical Decision ◽

Language Models ◽

Free Text ◽

Drug Labeling ◽

Real World Evidence

Drug labeling contains an ‘INDICATIONS AND USAGE’ that provides vital information to support clinical decision making and regulatory management. Effective extraction of drug indication information from free-text based resources could facilitate drug repositioning projects and help collect real-world evidence in support of secondary use of approved medicines. To enable AI-powered language models for the extraction of drug indication information, we used manual reading and curation to develop a Drug Indication Classification and Encyclopedia (DICE) based on FDA approved human prescription drug labeling. A DICE scheme with 7,231 sentences categorized into five classes (indications, contradictions, side effects, usage instructions, and clinical observations) was developed. To further elucidate the utility of the DICE, we developed nine different AI-based classifiers for the prediction of indications based on the developed DICE to comprehensively assess their performance. We found that the transformer-based language models yielded an average MCC of 0.887, outperforming the word embedding-based Bidirectional long short-term memory (BiLSTM) models (0.862) with a 2.82% improvement on the test set. The best classifiers were also used to extract drug indication information in DrugBank and achieved a high enrichment rate (>0.930) for this task. We found that domain-specific training could provide more explainable models without performance sacrifices and better generalization for external validation datasets. Altogether, the proposed DICE could be a standard resource for the development and evaluation of task-specific AI-powered, natural language processing (NLP) models.

Download Full-text

A comparative analysis on question classification task based on deep learning approaches

PeerJ Computer Science ◽

10.7717/peerj-cs.570 ◽

2021 ◽

Vol 7 ◽

pp. e570

Author(s):

Muhammad Zulqarnain ◽

Ahmed Khalaf Zager Alsaedi ◽

Rozaida Ghazali ◽

Muhammad Ghulam Ghouse ◽

Wareesa Sharif ◽

...

Keyword(s):

Deep Learning ◽

Language Processing ◽

Web Mining ◽

Question Answering ◽

Short Term Memory ◽

Learning Approaches ◽

Question Classification ◽

Considerable Impact ◽

Turkish Language ◽

Learning Architectures

Question classification is one of the essential tasks for automatic question answering implementation in natural language processing (NLP). Recently, there have been several text-mining issues such as text classification, document categorization, web mining, sentiment analysis, and spam filtering that have been successfully achieved by deep learning approaches. In this study, we illustrated and investigated our work on certain deep learning approaches for question classification tasks in an extremely inflected Turkish language. In this study, we trained and tested the deep learning architectures on the questions dataset in Turkish. In addition to this, we used three main deep learning approaches (Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN)) and we also applied two different deep learning combinations of CNN-GRU and CNN-LSTM architectures. Furthermore, we applied the Word2vec technique with both skip-gram and CBOW methods for word embedding with various vector sizes on a large corpus composed of user questions. By comparing analysis, we conducted an experiment on deep learning architectures based on test and 10-cross fold validation accuracy. Experiment results were obtained to illustrate the effectiveness of various Word2vec techniques that have a considerable impact on the accuracy rate using different deep learning approaches. We attained an accuracy of 93.7% by using these techniques on the question dataset.

Download Full-text

Exploring the Impact of COVID-19 on Social Life by Deep Learning

Information ◽

10.3390/info12110459 ◽

2021 ◽

Vol 12 (11) ◽

pp. 459

Author(s):

Jose Antonio Jijon-Vorbeck ◽

Isabel Segura-Bedmar

Keyword(s):

Deep Learning ◽

Language Processing ◽

Social Life ◽

Data Science ◽

Short Term Memory ◽

Negative Impact ◽

Machine Learning Algorithms ◽

World Health ◽

Health Organization ◽

The Impact

Due to the globalisation of the COVID-19 pandemic, and the expansion of social media as the main source of information for many people, there have been a great variety of different reactions surrounding the topic. The World Health Organization (WHO) announced in December 2020 that they were currently fighting an “infodemic” in the same way as they were fighting the pandemic. An “infodemic” relates to the spread of information that is not controlled or filtered, and can have a negative impact on society. If not managed properly, an aggressive or negative tweet can be very harmful and misleading among its recipients. Therefore, authorities at WHO have called for action and asked the academic and scientific community to develop tools for managing the infodemic by the use of digital technologies and data science. The goal of this study is to develop and apply natural language processing models using deep learning to classify a collection of tweets that refer to the COVID-19 pandemic. Several simpler and widely used models are applied first and serve as a benchmark for deep learning methods, such as Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers (BERT). The results of the experiments show that the deep learning models outperform the traditional machine learning algorithms. The best approach is the BERT-based model.

Download Full-text