Clinical concept extraction using transformers

2020 ◽  
Vol 27 (12) ◽  
pp. 1935-1942
Author(s):  
Xi Yang ◽  
Jiang Bian ◽  
William R Hogan ◽  
Yonghui Wu

Abstract Objective The goal of this study is to explore transformer-based models (eg, Bidirectional Encoder Representations from Transformers [BERT]) for clinical concept extraction and develop an open-source package with pretrained clinical models to facilitate concept extraction and other downstream natural language processing (NLP) tasks in the medical domain. Methods We systematically explored 4 widely used transformer-based architectures, including BERT, RoBERTa, ALBERT, and ELECTRA, for extracting various types of clinical concepts using 3 public datasets from the 2010 and 2012 i2b2 challenges and the 2018 n2c2 challenge. We examined general transformer models pretrained using general English corpora as well as clinical transformer models pretrained using a clinical corpus and compared them with a long short-term memory conditional random fields (LSTM-CRFs) model as a baseline. Furthermore, we integrated the 4 clinical transformer-based models into an open-source package. Results and Conclusion The RoBERTa-MIMIC model achieved state-of-the-art performance on 3 public clinical concept extraction datasets with F1-scores of 0.8994, 0.8053, and 0.8907, respectively. Compared to the baseline LSTM-CRFs model, RoBERTa-MIMIC markedly improved the F1-score by approximately 4% and 6% on the 2010 and 2012 i2b2 datasets, respectively. This study demonstrated the efficiency of transformer-based models for clinical concept extraction. Our methods and systems can be applied to other clinical tasks. The clinical transformer package with 4 pretrained clinical models is publicly available at https://github.com/uf-hobi-informatics-lab/ClinicalTransformerNER. We believe this package will improve current practice on clinical concept extraction and other tasks in the medical domain.
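As a minimal sketch of what transformer-based concept extraction looks like in practice, the snippet below uses the Hugging Face transformers API; the checkpoint name is a placeholder, and real use would substitute one of the pretrained clinical models from the ClinicalTransformerNER repository linked above.

```python
# Hedged sketch: tagging clinical concepts with a transformer token classifier.
# "some-clinical-ner-model" is a placeholder, not a real checkpoint name.
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_name = "some-clinical-ner-model"  # placeholder for a clinical NER checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

ner = pipeline("token-classification", model=model, tokenizer=tokenizer,
               aggregation_strategy="simple")  # merge word pieces into entity spans
for span in ner("Patient denies chest pain but reports shortness of breath."):
    print(span["entity_group"], span["word"], round(span["score"], 3))
```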

2020 ◽  
Vol 17 (6) ◽  
pp. 935-946
Author(s):  
Jihene Younes ◽  
Hadhemi Achour ◽  
Emna Souissi ◽  
Ahmed Ferchichi

Language identification is an important task in natural language processing that consists in determining the language of a given text. It has increasingly attracted the interest of researchers over the past few years, especially for code-switching informal textual content. In this paper, we focus on the identification of the Romanized user-generated Tunisian dialect on the social web. We segment and annotate a corpus extracted from social media and propose a deep learning approach for the identification task. We use a bidirectional Long Short-Term Memory neural network with Conditional Random Fields decoding (BLSTM-CRF). For word embeddings, we combine a word-character BLSTM vector representation with fastText embeddings, which take character n-gram features into consideration. The overall accuracy obtained is 98.65%.
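A minimal PyTorch sketch of the BLSTM-CRF tagging idea follows, assuming the pytorch-crf package; the dimensions and the concatenation of fastText and character-BLSTM vectors are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of a BLSTM-CRF tagger (PyTorch + pytorch-crf).
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf

class BLSTMCRFTagger(nn.Module):
    def __init__(self, word_dim=300, char_dim=50, hidden=128, num_tags=3):
        super().__init__()
        # Input: a pretrained fastText vector concatenated with a char-BLSTM vector.
        self.lstm = nn.LSTM(word_dim + char_dim, hidden,
                            batch_first=True, bidirectional=True)
        self.emit = nn.Linear(2 * hidden, num_tags)  # per-token emission scores
        self.crf = CRF(num_tags, batch_first=True)   # transition scores + Viterbi

    def forward(self, embeds, tags=None, mask=None):
        scores = self.emit(self.lstm(embeds)[0])
        if tags is not None:                        # training: negative log-likelihood
            return -self.crf(scores, tags, mask=mask, reduction="mean")
        return self.crf.decode(scores, mask=mask)   # inference: best tag sequence

# One batch of 2 sentences, 5 tokens each, with random stand-in embeddings.
model = BLSTMCRFTagger()
print(model(torch.randn(2, 5, 350)))  # decoded tag indices per token
```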


10.2196/22982 ◽  
2020 ◽  
Vol 8 (12) ◽  
pp. e22982
Author(s):  
Xi Yang ◽  
Hansi Zhang ◽  
Xing He ◽  
Jiang Bian ◽  
Yonghui Wu

Background Patients’ family history (FH) is a critical risk factor associated with numerous diseases. However, FH information is not well captured in structured databases but is often documented in clinical narratives. Natural language processing (NLP) is the key technology for extracting patients’ FH from clinical narratives. In 2019, the National NLP Clinical Challenge (n2c2) organized shared tasks to solicit NLP methods for FH information extraction. Objective This study presents our end-to-end FH extraction system developed during the 2019 n2c2 open shared task as well as the new transformer-based models that we developed after the challenge. We seek to develop a machine learning–based solution for FH information extraction without hand-crafted, task-specific rules. Methods We developed deep learning–based systems for FH concept extraction and relation identification. We explored deep learning models including long short-term memory–conditional random fields and bidirectional encoder representations from transformers (BERT), and developed ensemble models using a majority voting strategy. To further optimize performance, we systematically compared 3 different strategies for using BERT output representations for relation identification. Results Our system was among the top-ranked systems (3 out of 21) in the challenge. Our best system achieved micro-averaged F1 scores of 0.7944 and 0.6544 for concept extraction and relation identification, respectively. After the challenge, we further explored new transformer-based models and improved the performance on both subtasks to 0.8249 and 0.6775, respectively. For relation identification, our system achieved a performance comparable to the best system (0.6810) reported in the challenge. Conclusions This study demonstrated the feasibility of utilizing deep learning methods to extract FH information from clinical narratives.
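Which three pooling strategies the paper compares is paper-specific; as a hedged illustration of the general pattern only, the sketch below shows two common ways to pool BERT output representations for relation classification, with the label set and example sentence invented.

```python
# Hedged sketch: relation identification as classification over pooled BERT
# outputs. Two common pooling choices are shown for illustration only.
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
clf = nn.Linear(bert.config.hidden_size, 2)   # e.g., related / not related (untrained)

enc = tok("His father was diagnosed with diabetes.", return_tensors="pt")
hidden = bert(**enc).last_hidden_state        # (1, seq_len, hidden_size)

cls_repr = hidden[:, 0]                       # strategy A: the [CLS] vector
mean_repr = hidden.mean(dim=1)                # strategy B: mean over all tokens
print(clf(cls_repr).softmax(-1))              # class probabilities from each pooling
print(clf(mean_repr).softmax(-1))
```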


2019 ◽  
Vol 9 (18) ◽  
pp. 3795 ◽  
Author(s):  
Haihong E ◽  
Siqi Xiao ◽  
Meina Song

Entity-relation extraction is a basic task in natural language processing, and recently, the use of deep-learning methods, especially the Long Short-Term Memory (LSTM) network, has achieved remarkable performance. However, most existing entity-relation extraction methods cannot solve the overlapped multi-relation extraction problem, in which one or two entities are shared among multiple relational triples contained in a sentence. In this paper, we propose a text-generation method to solve this overlapping problem in entity-relation extraction. Based on this approach, (1) the entities and their corresponding relations are jointly generated as target texts without any additional feature engineering; and (2) the model directly generates the relational triples using a unified decoding process, in which entities can appear repeatedly across multiple triples to solve the overlapped-relation problem. We conduct experiments on two public datasets, NYT10 and NYT11. The experimental results show that our proposed method outperforms existing work and achieves the best results.
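The paper's encoder-decoder is its own architecture; purely as a hedged illustration of the linearization idea, the sketch below uses an off-the-shelf T5 model as a stand-in and an assumed triple format in which entities may repeat across triples.

```python
# Hedged sketch: linearizing relational triples as target text so a seq2seq
# model can emit overlapping triples (an entity may appear in several).
# T5 is a stand-in for the paper's own encoder-decoder; the "|" / ";" target
# format is an illustrative assumption.
from transformers import T5Tokenizer, T5ForConditionalGeneration

source = "Obama was born in Honolulu, the capital of Hawaii."
target = "Obama | born_in | Honolulu ; Honolulu | capital_of | Hawaii"

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
batch = tok(source, return_tensors="pt")
labels = tok(target, return_tensors="pt").input_ids
loss = model(**batch, labels=labels).loss  # fine-tuning objective on the linearization
print(float(loss))
```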


2021 ◽  
pp. 1-10
Author(s):  
Hye-Jeong Song ◽  
Tak-Sung Heo ◽  
Jong-Dae Kim ◽  
Chan-Young Park ◽  
Yu-Seop Kim

Sentence similarity evaluation is a significant task used in machine translation, classification, and information extraction in the field of natural language processing. When two sentences are given, an accurate judgment should be made as to whether their meanings are equivalent even if their words and contexts differ. To this end, existing studies have measured the similarity of sentences by focusing on the analysis of words, morphemes, and letters. To measure sentence similarity, this study uses Sent2Vec, a sentence embedding, as well as morpheme word embeddings. Vectors representing words are input to a one-dimensional convolutional neural network (1D-CNN) with kernels of various sizes and to a bidirectional long short-term memory network (Bi-LSTM). Self-attention is applied to the features produced by the Bi-LSTM. Subsequently, the vectors passing through the 1D-CNN and the self-attention are converted through global max pooling and global average pooling, respectively, to extract representative values. The vectors generated through the above process are concatenated with the vector generated through Sent2Vec and represented as a single vector. This vector is input to a softmax layer, which finally determines the similarity between the two sentences. The proposed model improves accuracy by up to 5.42 percentage points compared with conventional sentence similarity estimation models.
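A hedged PyTorch sketch of the described encoder follows; all dimensions, kernel sizes, and the attention head count are illustrative assumptions. The resulting vector would be concatenated with the Sent2Vec embedding and passed to a softmax layer as described.

```python
# Hedged sketch: 1D-CNN (global max pool) + Bi-LSTM with self-attention
# (global average pool), concatenated into one sentence vector.
import torch
import torch.nn as nn

class SimilarityEncoder(nn.Module):
    def __init__(self, emb=300, hidden=128, kernels=(2, 3, 4)):
        super().__init__()
        self.convs = nn.ModuleList(                  # 1D-CNN with several kernel sizes
            [nn.Conv1d(emb, hidden, k, padding=k // 2) for k in kernels])
        self.lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=4,
                                          batch_first=True)  # self-attention

    def forward(self, x):                            # x: (batch, tokens, emb)
        xt = x.transpose(1, 2)                       # Conv1d wants (batch, emb, tokens)
        c = torch.cat([conv(xt).amax(2) for conv in self.convs], dim=1)  # global max pool
        h, _ = self.lstm(x)                          # Bi-LSTM over the tokens
        a, _ = self.attn(h, h, h)                    # attend over Bi-LSTM states
        return torch.cat([c, a.mean(1)], dim=1)      # global average pool + concat

enc = SimilarityEncoder()
print(enc(torch.randn(2, 20, 300)).shape)  # one vector per sentence; then Sent2Vec + softmax
```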


2018 ◽  
Author(s):  
Masayu Leylia Khodra ◽  
Yudi Wibisono

With the large number of online news articles published at any given moment, an event extraction system can help news readers by providing structured information from each article. Event extraction from news articles is the process of obtaining structured 5W1H information: who did what, when, where, why, and how. This 5W1H extraction is one type of information extraction. The 5W1H extraction model is built with a sequence-labeling approach based on the BIO (Begin Inside Outside) scheme. Since each paragraph contains a single main idea, ideally one 5W1H frame instance is produced per paragraph, and a news article is represented by a set of 5W1H frame instances. This paper therefore discusses the construction of a paragraph-based 5W1H event extraction model. The model was built using a corpus of 610 paragraphs taken from 57 news articles manually annotated with 5W1H information, using a bidirectional LSTM (long short-term memory) and CRF (conditional random fields) architecture. In the evaluation stage, the model achieved an F1 score of 0.62.
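To make the BIO scheme concrete, here is a hedged illustration of how one annotated token sequence might look; the exact tag names in the paper's corpus are an assumption.

```python
# Hedged illustration of BIO-scheme 5W1H annotation; tag names are assumed.
tokens = ["Police", "arrested", "two", "suspects", "in", "Jakarta",
          "on", "Monday", "."]
tags   = ["B-WHO",  "B-WHAT",  "I-WHAT", "I-WHAT", "O", "B-WHERE",
          "O", "B-WHEN", "O"]

# One (token, tag) pair per line: the usual input format for a
# sequence-labeling model such as a bidirectional-LSTM-CRF.
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```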


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Kazi Nabiul Alam ◽  
Md Shakib Khan ◽  
Abdur Rab Dhruba ◽  
Mohammad Monirujjaman Khan ◽  
Jehad F. Al-Amri ◽  
...  

The COVID-19 pandemic has had a devastating effect on many people, creating severe anxiety, fear, and complicated feelings or emotions. After the initiation of vaccinations against coronavirus, people’s feelings have become more diverse and complex. Our aim is to understand and unravel their sentiments in this research using deep learning techniques. Social media is currently the best way to express feelings and emotions, and with the help of Twitter, one can have a better idea of what is trending and going on in people’s minds. Our motivation for this research was to understand the diverse sentiments of people regarding the vaccination process. In this research, the timeline of the collected tweets was from December 21 to July 21. The tweets contained information about the most common vaccines recently available across the world. The sentiments of people regarding vaccines of all sorts were assessed using the natural language processing (NLP) tool Valence Aware Dictionary and sEntiment Reasoner (VADER). Grouping the polarities of the obtained sentiments into three classes (positive, negative, and neutral) helped us visualize the overall scenario; our findings included 33.96% positive, 17.55% negative, and 48.49% neutral responses. In addition, we included an analysis of the timeline of the tweets in this research, as sentiments fluctuated over time. A recurrent neural network (RNN)-oriented architecture, including long short-term memory (LSTM) and bidirectional LSTM (Bi-LSTM), was used to assess the performance of the predictive models, with LSTM achieving an accuracy of 90.59% and Bi-LSTM achieving 90.83%. Other performance metrics such as precision, recall, F1-score, and a confusion matrix were also used to validate our models and findings more effectively. This study improves understanding of the public’s opinion on COVID-19 vaccines and supports the aim of eradicating coronavirus from the world.
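A minimal sketch of the VADER polarity grouping described above, using the vaderSentiment package; the ±0.05 cutoffs on the compound score are VADER's conventional thresholds, and the example tweet is invented.

```python
# Hedged sketch: classifying a tweet as positive/negative/neutral with VADER.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
tweet = "Got my second dose today, feeling hopeful!"  # invented example
compound = analyzer.polarity_scores(tweet)["compound"]  # score in [-1, 1]

if compound >= 0.05:
    label = "positive"
elif compound <= -0.05:
    label = "negative"
else:
    label = "neutral"
print(label, compound)
```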


2018 ◽  
Vol 10 (11) ◽  
pp. 113 ◽  
Author(s):  
Yue Li ◽  
Xutao Wang ◽  
Pengjian Xu

Text classification is important in natural language processing, as massive amounts of text containing valuable information need to be classified into different categories for further use. In order to classify text better, our paper tries to build a deep learning model that achieves better classification results on Chinese text than other researchers’ models. After comparing different methods, long short-term memory (LSTM) and convolutional neural network (CNN) methods were selected as the deep learning methods for classifying Chinese text. LSTM is a special kind of recurrent neural network (RNN) capable of processing serialized information through its recurrent structure. By contrast, CNNs have shown their ability to extract features from visual imagery. Therefore, two layers of LSTM and one layer of CNN were integrated into our new model: the BLSTM-C model (BLSTM stands for bi-directional long short-term memory, while C stands for CNN). The BLSTM was responsible for obtaining a sequence output based on past and future contexts, which was then input to the convolutional layer for extracting features. In our experiments, the proposed BLSTM-C model was evaluated in several ways, and it exhibited remarkable performance in text classification, especially on Chinese texts.
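A hedged PyTorch sketch of the BLSTM-C idea follows: two Bi-LSTM layers produce a context-aware sequence, and one convolutional layer extracts features from it. All sizes are illustrative assumptions.

```python
# Hedged sketch of BLSTM-C: stacked Bi-LSTM feeding a single CNN layer.
import torch
import torch.nn as nn

class BLSTMC(nn.Module):
    def __init__(self, emb=300, hidden=128, classes=10, kernel=3):
        super().__init__()
        self.lstm = nn.LSTM(emb, hidden, num_layers=2,
                            batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(2 * hidden, 100, kernel)  # CNN over Bi-LSTM outputs
        self.fc = nn.Linear(100, classes)

    def forward(self, x):                    # x: (batch, tokens, emb)
        h, _ = self.lstm(x)                  # sequence output from past+future context
        f = torch.relu(self.conv(h.transpose(1, 2)))
        return self.fc(f.amax(2))            # max-pool the features, then classify

model = BLSTMC()
print(model(torch.randn(4, 30, 300)).shape)  # (4, 10) class logits
```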

