Multi-Level Representation Learning for Chinese Medical Entity Recognition: Model Development and Validation

Background Medical entity recognition is a key technology that supports the development of smart medicine. Existing methods on English medical entity recognition have undergone great development, but their progress in the Chinese language has been slow. Because of limitations due to the complexity of the Chinese language and annotated corpora, these methods are based on simple neural networks, which cannot effectively extract the deep semantic representations of electronic medical records (EMRs) and be used on the scarce medical corpora. We thus developed a new Chinese EMR (CEMR) dataset with six types of entities and proposed a multi-level representation learning model based on Bidirectional Encoder Representation from Transformers (BERT) for Chinese medical entity recognition. Objective This study aimed to improve the performance of the language model by having it learn multi-level representation and recognize Chinese medical entities. Methods In this paper, the pretraining language representation model was investigated; utilizing information not only from the final layer but from intermediate layers was found to affect the performance of the Chinese medical entity recognition task. Therefore, we proposed a multi-level representation learning model for entity recognition in Chinese EMRs. Specifically, we first used the BERT language model to extract semantic representations. Then, the multi-head attention mechanism was leveraged to automatically extract deeper semantic information from each layer. Finally, semantic representations from multi-level representation extraction were utilized as the final semantic context embedding for each token and we used softmax to predict the entity tags. Results The best F1 score reached by the experiment was 82.11% when using the CEMR dataset, and the F1 score when using the CCKS (China Conference on Knowledge Graph and Semantic Computing) 2018 benchmark dataset further increased to 83.18%. Various comparative experiments showed that our proposed method outperforms methods from previous work and performs as a new state-of-the-art method. Conclusions The multi-level representation learning model is proposed as a method to perform the Chinese EMRs entity recognition task. Experiments on two clinical datasets demonstrate the usefulness of using the multi-head attention mechanism to extract multi-level representation as part of the language model.

Download Full-text

Differentiated Attentive Representation Learning for Sentence Classification

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/644 ◽

2018 ◽

Cited By ~ 5

Author(s):

Qianrong Zhou ◽

Xiaojie Wang ◽

Xuan Dong

Keyword(s):

Representation Learning ◽

Learning Model ◽

Attention Mechanism ◽

Experimental Results ◽

Sentence Classification ◽

Synthetic Datasets

Attention-based models have shown to be effective in learning representations for sentence classification. They are typically equipped with multi-hop attention mechanism. However, existing multi-hop models still suffer from the problem of paying much attention to the most frequently noticed words, which might not be important to classify the current sentence. And there is a lack of explicitly effective way that helps the attention to be shifted out of a wrong part in the sentence. In this paper, we alleviate this problem by proposing a differentiated attentive learning model. It is composed of two branches of attention subnets and an example discriminator. An explicit signal with the loss information of the first attention subnet is passed on to the second one to drive them to learn different attentive preference. The example discriminator then selects the suitable attention subnet for sentence classification. Experimental results on real and synthetic datasets demonstrate the effectiveness of our model.

Download Full-text

Neural News Recommendation with Attentive Multi-View Learning

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/536 ◽

2019 ◽

Cited By ~ 6

Author(s):

Chuhan Wu ◽

Fangzhao Wu ◽

Mingxiao An ◽

Jianqiang Huang ◽

Yongfeng Huang ◽

...

Keyword(s):

User Experience ◽

Real World ◽

Representation Learning ◽

Learning Model ◽

Online News ◽

Attention Mechanism ◽

The Core ◽

Word Level ◽

News Recommendation

Personalized news recommendation is very important for online news platforms to help users find interested news and improve user experience. News and user representation learning is critical for news recommendation. Existing news recommendation methods usually learn these representations based on single news information, e.g., title, which may be insufficient. In this paper we propose a neural news recommendation approach which can learn informative representations of users and news by exploiting different kinds of news information. The core of our approach is a news encoder and a user encoder. In the news encoder we propose an attentive multi-view learning model to learn unified news representations from titles, bodies and topic categories by regarding them as different views of news. In addition, we apply both word-level and view-level attention mechanism to news encoder to select important words and views for learning informative news representations. In the user encoder we learn the representations of users based on their browsed news and apply attention mechanism to select informative news for user representation learning. Extensive experiments on a real-world dataset show our approach can effectively improve the performance of news recommendation.

Download Full-text

Toward Sustainable Virtualized Healthcare: Extracting Medical Entities from Chinese Online Health Consultations Using Deep Neural Networks

Sustainability ◽

10.3390/su10093292 ◽

2018 ◽

Vol 10 (9) ◽

pp. 3292 ◽

Cited By ~ 5

Author(s):

Hangzhou Yang ◽

Huiying Gao

Keyword(s):

Neural Networks ◽

Disease Surveillance ◽

Deep Neural Networks ◽

Short Term Memory ◽

Recognition Task ◽

Healthcare Services ◽

Entity Recognition ◽

Context Word ◽

Entity Extraction ◽

Medical Entity

Increasingly popular virtualized healthcare services such as online health consultations have significantly changed the way in which health information is sought, and can alleviate geographic barriers, time constraints, and medical resource shortage problems. These online patient–doctor communications have been generating abundant amounts of healthcare-related data. Medical entity extraction from these data is the foundation of medical knowledge discovery, including disease surveillance and adverse drug reaction detection, which can potentially enhance the sustainability of healthcare. Previous studies that focus on health-related entity extraction have certain limitations such as demanding tough handcrafted feature engineering, failing to extract out-of-vocabulary entities, and being unsuitable for the Chinese social media context. Motivated by these observations, this study proposes a novel model named CNMER (Chinese Medical Entity Recognition) using deep neural networks for medical entity recognition in Chinese online health consultations. The designed model utilizes Bidirectional Long Short-Term Memory and Conditional Random Fields as the basic architecture, and uses character embedding and context word embedding to automatically learn effective features to recognize and classify medical-related entities. Exploiting the consultation text collected from a prevalent online health community in China, the evaluation results indicate that the proposed method significantly outperforms the related state-of-the-art models that focus on the Chinese medical entity recognition task. We expect that our model can contribute to the sustainable development of the virtualized healthcare industry.

Download Full-text

Medical Named Entity Extraction from Chinese Resident Admit Notes Using Character and Word Attention-Enhanced Neural Network

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph17051614 ◽

2020 ◽

Vol 17 (5) ◽

pp. 1614

Author(s):

Yan Gao ◽

Yandong Wang ◽

Patrick Wang ◽

Lei Gu

Keyword(s):

Neural Network ◽

Electronic Medical Records ◽

Medical Records ◽

Attention Mechanism ◽

Entity Recognition ◽

Medical Decision ◽

Word Embeddings ◽

Entity Extraction ◽

Medical Entity ◽

Word Attention

The resident admit notes (RANs) in electronic medical records (EMRs) is first-hand information to study the patient’s condition. Medical entity extraction of RANs is an important task to get disease information for medical decision-making. For Chinese electronic medical records, each medical entity contains not only word information but also rich character information. Effective combination of words and characters is very important for medical entity extraction. We propose a medical entity recognition model based on a character and word attention-enhanced (CWAE) neural network for Chinese RANs. In our model, word embeddings and character-based embeddings are obtained through character-enhanced word embedding (CWE) model and Convolutional Neural Network (CNN) model. Then attention mechanism combines the character-based embeddings and word embeddings together, which significantly improves the expression ability of words. The new word embeddings obtained by the attention mechanism are taken as the input to bidirectional long short-term memory (BI-LSTM) and conditional random field (CRF) to extract entities. We extracted nine types of key medical entities from Chinese RANs and evaluated our model. The proposed method was compared with two traditional machine learning methods CRF, support vector machine (SVM), and the related deep learning models. The result shows that our model has better performance, and the result of our model reaches 94.44% in the F1-score.

Download Full-text

Incorporating multi-level CNN and attention mechanism for Chinese clinical named entity recognition

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2021.103737 ◽

2021 ◽

Vol 116 ◽

pp. 103737

Author(s):

Jun Kong ◽

Leixin Zhang ◽

Min Jiang ◽

Tianshan Liu

Keyword(s):

Named Entity Recognition ◽

Attention Mechanism ◽

Entity Recognition ◽

Named Entity ◽

Multi Level

Download Full-text

Medical Entity Recognition Based on BiLSTM with Knowledge Graph and Attention Mechanism

10.1109/icoias53694.2021.00035 ◽

2021 ◽

Author(s):

Qiaoling Wang ◽

Yu Liu ◽

Jinguang Gu ◽

Haidong Fu

Keyword(s):

Attention Mechanism ◽

Entity Recognition ◽

Knowledge Graph ◽

Medical Entity

Download Full-text

Attention-based Multi-level Feature Fusion for Named Entity Recognition

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/497 ◽

2020 ◽

Author(s):

Zhiwei Yang ◽

Hechang Chen ◽

Jiawei Zhang ◽

Jing Ma ◽

Yi Chang

Keyword(s):

Language Processing ◽

Feature Fusion ◽

Named Entity Recognition ◽

Representation Learning ◽

Entity Recognition ◽

Named Entity ◽

Local Character ◽

Word Level ◽

Benchmark Datasets ◽

Multi Level

Named entity recognition (NER) is a fundamental task in the natural language processing (NLP) area. Recently, representation learning methods (e.g., character embedding and word embedding) have achieved promising recognition results. However, existing models only consider partial features derived from words or characters while failing to integrate semantic and syntactic information (e.g., capitalization, inter-word relations, keywords, lexical phrases, etc.) from multi-level perspectives. Intuitively, multi-level features can be helpful when recognizing named entities from complex sentences. In this study, we propose a novel framework called attention-based multi-level feature fusion (AMFF), which is used to capture the multi-level features from different perspectives to improve NER. Our model consists of four components to respectively capture the local character-level, global character-level, local word-level, and global word-level features, which are then fed into a BiLSTM-CRF network for the final sequence labeling. Extensive experimental results on four benchmark datasets show that our proposed model outperforms a set of state-of-the-art baselines.

Download Full-text