Models Distillation with Lifelong Deep Learning for Vietnamese Biomedical Named Entity Recognition

Abstract

Background Biomedical named entity recognition (NER) is a fundamental task of biomedical text mining that finds the boundaries of entity mentions in biomedical text and determines their entity type. To accelerate the development of biomedical NER techniques in Spanish, the PharmaCoNER organizers launched a competition to recognize pharmacological substances, compounds, and proteins. Biomedical NER is usually treated as a sequence labeling task, and almost all state-of-the-art sequence labeling methods ignore the meaning of different entity types. In this paper, we investigate methods to introduce the meaning of entity types into deep learning methods for biomedical NER and apply them to the PharmaCoNER 2019 challenge. The meaning of each entity type is represented by its definition information.

Material and method We investigate how to use entity definition information in the following two methods: (1) SQuAD-style machine reading comprehension (MRC) methods that treat entity definition information as the query and biomedical text as the context, and predict answer spans as entities. (2) Span-level one-pass (SOne) methods that predict entity spans one type at a time and introduce entity type meaning, which is represented by entity definition information. All models are trained and tested on the PharmaCoNER 2019 corpus, and their performance is evaluated by strict micro-averaged precision, recall, and F1-score.

Results Entity definition information improves both SQuAD-style MRC and SOne methods by about 0.003 in micro-averaged F1-score. The SQuAD-style MRC model using entity definition information as the query achieves the best performance, with a micro-averaged precision of 0.9225, a recall of 0.9050, and an F1-score of 0.9137. It outperforms the best model of the PharmaCoNER 2019 challenge by 0.0032 in F1-score. Compared with the state-of-the-art model without using manually-crafted features, our model obtains a 1% improvement in F1-score, which is significant. These results indicate that entity definition information is useful for deep learning methods on biomedical NER.

Conclusion Our entity definition information enhanced models achieve a state-of-the-art micro-averaged F1-score of 0.9137, which implies that entity definition information has a positive impact on biomedical NER. In the future, we will explore more entity definition information from knowledge graphs.
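The strict micro-averaged evaluation mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the official PharmaCoNER scorer: it assumes each entity is represented as a `(doc_id, start, end, type)` tuple, and the function names `strict_micro_prf` and `f1_from_pr`, as well as the entity labels in the usage note, are hypothetical. Under strict matching, a prediction counts as correct only if both its boundaries and its type exactly match a gold entity; micro-averaging pools the counts over all documents and types before computing the scores.

```python
from typing import Set, Tuple

# Assumed representation (not from the source): (doc_id, start, end, entity_type)
Entity = Tuple[str, int, int, str]


def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def strict_micro_prf(gold: Set[Entity], pred: Set[Entity]) -> Tuple[float, float, float]:
    """Strict micro-averaged precision, recall, and F1.

    A predicted entity is a true positive only if an identical tuple
    (same document, same start and end offsets, same type) is in gold.
    Counts are pooled across all documents and entity types.
    """
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_from_pr(precision, recall)
```

For example, with gold entities `{("d1", 0, 5, "PROTEIN"), ("d1", 10, 20, "CHEMICAL")}` and predictions `{("d1", 0, 5, "PROTEIN"), ("d1", 10, 19, "CHEMICAL")}`, only the first prediction counts, because the second has a boundary that is off by one character. Plugging the reported precision and recall into `f1_from_pr(0.9225, 0.9050)` gives a value that rounds to the reported F1-score of 0.9137.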


Faculty Opinions recommendation of Deep learning with word embeddings improves biomedical named entity recognition.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽ 10.3410/f.730927099.793537967 ◽ 2017

Author(s): Nigel Collier

Keyword(s): Deep Learning ◽ Named Entity Recognition ◽ Entity Recognition ◽ Word Embeddings ◽ Named Entity ◽ Biomedical Named Entity Recognition


Hierarchical shared transfer learning for biomedical named entity recognition

BMC Bioinformatics ◽ 10.1186/s12859-021-04551-4 ◽ 2022 ◽ Vol 23 (1)

Author(s): Zhaoying Chai ◽ Han Jin ◽ Shenghui Shi ◽ Siyan Zhan ◽ Lin Zhuo ◽ ...

Keyword(s): Deep Learning ◽ Transfer Learning ◽ Medical Information ◽ Named Entity Recognition ◽ Fine Tuning ◽ Entity Recognition ◽ Single Task ◽ Named Entity ◽ Task Learning ◽ Biomedical Named Entity Recognition

Abstract

Background Biomedical named entity recognition (BioNER) is a basic and important medical information extraction task that extracts medical entities with special meaning from medical texts. In recent years, deep learning has become the main research direction of BioNER due to its excellent data-driven context coding ability. However, in the BioNER task, deep learning suffers from poor generalization and instability.

Results We propose hierarchical shared transfer learning, which combines multi-task learning and fine-tuning and realizes multi-level information fusion between the underlying entity features and the upper data features. We select 14 datasets containing 4 types of entities for training and evaluate the model. The experimental results showed that, compared to the single-task XLNet-CRF model, the F1-scores on the six gold standard datasets BC5CDR-chemical, BC5CDR-disease, BC2GM, BC4CHEMD, NCBI-disease and LINNAEUS changed by 0.57, 0.90, 0.42, 0.77, 0.98 and −2.16, respectively. BC5CDR-chemical, BC5CDR-disease and BC4CHEMD achieved state-of-the-art results. The reasons why LINNAEUS's multi-task results are lower than its single-task results are discussed at the dataset level.

Conclusion Compared with using multi-task learning or fine-tuning alone, the model recognizes medical entities more accurately and has higher generalization and stability.


Named Entity Recognition Method for Fault Knowledge based on Deep Learning

Proceedings of the 4th International Conference on Machine Learning and Soft Computing ◽ 10.1145/3380688.3380690 ◽ 2020

Author(s): Zhicheng Chen ◽ Xiaobao Liu ◽ Yanchao Yin ◽ Hongbiao Lu

Keyword(s): Deep Learning ◽ Named Entity Recognition ◽ Entity Recognition ◽ Recognition Method ◽ Named Entity ◽ Knowledge Based
