Combining Multi-task Learning with Transfer Learning for Biomedical Named Entity Recognition

2020 ◽  
Vol 176 ◽  
pp. 848-857 ◽  
Author(s):  
Tahir Mehmood ◽  
Alfonso E. Gerevini ◽  
Alberto Lavelli ◽  
Ivan Serina
2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Zhaoying Chai ◽  
Han Jin ◽  
Shenghui Shi ◽  
Siyan Zhan ◽  
Lin Zhuo ◽  
...  

Abstract Background Biomedical named entity recognition (BioNER) is a basic and important medical information extraction task that extracts medical entities with special meaning from medical texts. In recent years, deep learning has become the main research direction in BioNER because of its excellent data-driven context encoding ability. However, in the BioNER task, deep learning suffers from poor generalization and instability. Results We propose hierarchical shared transfer learning, which combines multi-task learning and fine-tuning and realizes multi-level information fusion between the underlying entity features and the upper data features. We select 14 datasets containing 4 types of entities to train and evaluate the model. The experimental results show that, compared with the single-task XLNet-CRF model, the F1-scores on the six gold standard datasets BC5CDR-chemical, BC5CDR-disease, BC2GM, BC4CHEMD, NCBI-disease and LINNAEUS changed by 0.57, 0.90, 0.42, 0.77, 0.98 and −2.16, respectively. BC5CDR-chemical, BC5CDR-disease and BC4CHEMD achieved state-of-the-art results. The reasons why LINNAEUS's multi-task results are lower than its single-task results are discussed at the dataset level. Conclusion Compared with using multi-task learning or fine-tuning alone, the model recognizes medical entities more accurately and has higher generalization and stability.


2020 ◽  
Vol 36 (15) ◽  
pp. 4331-4338
Author(s):  
Mei Zuo ◽  
Yang Zhang

Abstract Motivation Named entity recognition is a critical and fundamental task for biomedical text mining. Recently, researchers have focused on exploiting deep neural networks for biomedical named entity recognition (Bio-NER). The performance of deep neural networks on a single dataset mostly depends on data quality and quantity, while high-quality data tends to be limited in size. To alleviate task-specific data limitations, some studies explored multi-task learning (MTL) for Bio-NER and achieved state-of-the-art performance. However, these MTL methods did not make full use of information from the various Bio-NER datasets. The performance of the state-of-the-art MTL method was significantly limited by the number of training datasets. Results We propose two dataset-aware MTL approaches for Bio-NER which jointly train all models for numerous Bio-NER datasets, so that each of these models can discriminatively exploit information from all related training datasets. Both approaches achieve substantially better performance than the state-of-the-art MTL method on 14 out of 15 Bio-NER datasets. Furthermore, we implemented our approaches by incorporating Bio-NER and biomedical part-of-speech (POS) tagging datasets. The results verify that Bio-NER and POS tagging can significantly enhance one another. Availability and implementation Our source code is available at https://github.com/zmmzGitHub/MTL-BC-LBC-BioNER and all datasets are publicly available at https://github.com/cambridgeltl/MTL-Bioinformatics-2016. Supplementary information Supplementary data are available at Bioinformatics online.


2018 ◽  
Vol 35 (10) ◽  
pp. 1745-1752 ◽  
Author(s):  
Xuan Wang ◽  
Yu Zhang ◽  
Xiang Ren ◽  
Yuhao Zhang ◽  
Marinka Zitnik ◽  
...  

2019 ◽  
Author(s):  
John Giorgi ◽  
Gary Bader

Motivation: Automatic biomedical named entity recognition (BioNER) is a key task in biomedical information extraction (IE). For some time, state-of-the-art BioNER has been dominated by machine learning methods, particularly conditional random fields (CRFs), with a recent focus on deep learning. However, recent work has suggested that the high performance of CRFs for BioNER may not generalize to corpora other than the one it was trained on. In our analysis, we find that a popular deep learning-based approach to BioNER, known as bidirectional long short-term memory network-conditional random field (BiLSTM-CRF), is correspondingly poor at generalizing, often dramatically overfitting the corpus it was trained on. To address this, we evaluate three modifications of BiLSTM-CRF for BioNER to alleviate overfitting and improve generalization: improved regularization via variational dropout, transfer learning, and multi-task learning. Results: We measure the effect that each strategy has when training and testing on the same corpus ("in-corpus" performance) and when training on one corpus and evaluating on another ("out-of-corpus" performance), our measure of the model's ability to generalize. We found that variational dropout improves out-of-corpus performance by an average of 4.62%, transfer learning by 6.48% and multi-task learning by 8.42%. The maximal increase we identified combines multi-task learning and variational dropout, which boosts out-of-corpus performance by 10.75%. Furthermore, we make available a new open-source tool, called Saber, that implements our best BioNER models. Availability: Source code for our biomedical IE tool is available at https://github.com/BaderLab/saber. Corpora and other resources used in this study are available at https://github.com/BaderLab/Towards-reliable-BioNER.
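
The in-corpus versus out-of-corpus comparison described above can be illustrated with a small scoring routine. The following is a minimal sketch, not the authors' evaluation code: it assumes BIO-tagged sequences and exact-match, entity-level F1 as the metric (the abstract does not name the metric), and the helper names `bio_to_spans` and `entity_f1` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# (start, end_exclusive, entity_type); an illustrative representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence.

    Assumption: an I- tag that does not continue a span of the same
    type is ignored rather than opening a new span.
    """
    spans: Set[Span] = set()
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match entity-level F1 over a list of sentences."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = bio_to_spans(g), bio_to_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy data: one of two gold entities is recovered.
gold = [["B-Disease", "I-Disease", "O", "B-Disease"]]
pred = [["B-Disease", "I-Disease", "O", "O"]]
score = entity_f1(gold, pred)
```

Under this sketch, in-corpus performance would be `entity_f1` computed on the test split of the training corpus, and out-of-corpus performance would be the same function applied to predictions on a different corpus annotated for the same entity type.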


2021 ◽  
Author(s):  
Lisa Langnickel ◽  
Juliane Fluck

Intense research has been done in the area of biomedical natural language processing. Since the breakthrough of transfer learning-based methods, BERT models have been used in a variety of biomedical and clinical applications. For the available data sets, these models show excellent results, partly exceeding the inter-annotator agreements. However, biomedical named entity recognition applied to COVID-19 preprints shows a performance drop compared to the results on available test data. The question arises how well trained models are able to predict on completely new data, i.e. to generalize. Using disease named entity recognition as an example, we investigate the robustness of different machine learning-based methods, including transfer learning, and show that current state-of-the-art methods work well for a given training set and the corresponding test set but show a significant lack of generalization when applied to new data. We therefore argue that there is a need for larger annotated data sets for training and testing.

