A Study of the Importance of External Knowledge in the Named Entity Recognition Task

This research presents a comparative study of the Named Entity Recognition task for scientific article texts. Natural language processing can be considered one of the cornerstones of machine learning; it is devoted to problems connected with understanding natural languages and with linguistic analysis. Current deep learning techniques have already been shown to perform well in areas such as image recognition, pattern recognition, and computer vision, which suggests that they may also succeed in natural language processing, and this has led to a sharp increase in research interest in the topic. For a long time, fairly simple algorithms were used in this area, such as support vector machines or various types of regression, together with basic encodings of text data, and these did not produce strong results. The experiments used the Scientific Entity Relation Core dataset. The algorithms used were Long Short-Term Memory, a Random Forest Classifier with Conditional Random Fields, and Named Entity Recognition with Bidirectional Encoder Representations from Transformers. In the findings, the metric scores of all models are compared with one another. This research is devoted to processing scientific articles in the machine learning area because the subject has not yet been investigated thoroughly enough. Work on this task can help machines understand natural languages better, so that they can solve other natural language processing tasks better and improve scores in general.

When External Knowledge Does Not Aggregate in Named Entity Recognition

10.1007/978-3-030-91699-2_42 ◽ 2021 ◽ pp. 616-627

Author(s): Pedro Ivo Monteiro Privatto ◽ Ivan Rizzo Guilherme

Keyword(s): Named Entity Recognition ◽ Entity Recognition ◽ External Knowledge ◽ Named Entity

Information Extraction from Electronic Medical Records Using Multitask Recurrent Neural Network with Contextual Word Embedding

Applied Sciences ◽ 10.3390/app9183658 ◽ 2019 ◽ Vol 9 (18) ◽ pp. 3658 ◽ Cited By ~ 6

Author(s): Jianliang Yang ◽ Yuenan Liu ◽ Minghui Qian ◽ Chenghua Guan ◽ Xiangfei Yuan

Keyword(s): Electronic Medical Records ◽ Medical Records ◽ Large Scale ◽ Short Term Memory ◽ Conditional Random Field ◽ Named Entity Recognition ◽ Recognition Task ◽ Entity Recognition ◽ Language Models ◽ Named Entity

Clinical named entity recognition is an essential task for humans to analyze large-scale electronic medical records efficiently. Traditional rule-based solutions need considerable human effort to build rules and dictionaries; machine learning-based solutions need laborious feature engineering. At present, deep learning solutions such as Long Short-term Memory with Conditional Random Field (LSTM–CRF) have achieved considerable performance on many datasets. In this paper, we developed a multitask attention-based bidirectional LSTM–CRF (Att-biLSTM–CRF) model with pretrained Embeddings from Language Models (ELMo) in order to achieve better performance. In the multitask system, an additional task, entity discovery, was designed to enhance the model's perception of unknown entities. Experiments were conducted on the 2010 Informatics for Integrating Biology & the Bedside/Veterans Affairs (I2B2/VA) dataset. Experimental results show that our model outperforms the state-of-the-art solution in both the single-model and ensemble-model settings. Our work proposes an approach to improve recall in the clinical named entity recognition task based on the multitask mechanism.

Named entity recognition from spoken documents using global evidences and external knowledge sources with applications on Mandarin Chinese

IEEE Workshop on Automatic Speech Recognition and Understanding, 2005. ◽ 10.1109/asru.2005.1566535 ◽ 2005

Author(s): Yi-cheng Pan ◽ Yu-ying Liu ◽ Lin-shan Lee

Keyword(s): Mandarin Chinese ◽ Named Entity Recognition ◽ Entity Recognition ◽ Knowledge Sources ◽ External Knowledge ◽ Named Entity ◽ Spoken Documents

A Sequence-to-Set Network for Nested Named Entity Recognition

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2021/542 ◽ 2021

Author(s): Zeqi Tan ◽ Yongliang Shen ◽ Shuai Zhang ◽ Weiming Lu ◽ Yueting Zhuang

Keyword(s): Language Processing ◽ Named Entity Recognition ◽ Recognition Task ◽ Search Space ◽ Entity Recognition ◽ Bipartite Matching ◽ Named Entity ◽ Proposed Model ◽ Fixed Set ◽ Sequence Method

Named entity recognition (NER) is a widely studied task in natural language processing. Recently, a growing number of studies have focused on nested NER. Span-based methods, which treat entity recognition as a span classification task, can deal with nested entities naturally, but they suffer from a huge search space and a lack of interactions between entities. To address these issues, we propose a novel sequence-to-set neural network for nested NER. Instead of specifying candidate spans in advance, we provide a fixed set of learnable vectors to learn the patterns of the valuable spans. We utilize a non-autoregressive decoder to predict the final set of entities in one pass, in which we are able to capture dependencies between entities. Compared with the sequence-to-sequence method, our model is more suitable for such an unordered recognition task, as it is insensitive to the label order. In addition, we utilize a loss function based on bipartite matching to compute the overall training loss. Experimental results show that our proposed model achieves state-of-the-art results on three nested NER corpora: ACE 2004, ACE 2005 and KBP 2017. The code is available at https://github.com/zqtan1024/sequence-to-set.
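The bipartite-matching idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cost function `span_cost`, the `(start, end, label)` tuple layout, and the brute-force search are all assumptions chosen to keep the example self-contained. It shows only why a matching-based cost is insensitive to the order of the gold entities.

```python
from itertools import permutations

def span_cost(pred, gold):
    """Hypothetical cost: one point per mismatching field of (start, end, label)."""
    return sum(p != g for p, g in zip(pred, gold))

def best_matching(preds, golds):
    """Find the one-to-one assignment of gold entities to predictions with minimal cost.

    Returns (total_cost, assignment), where assignment[j] is the index of the
    prediction matched to golds[j]. Brute force, so only suitable for tiny sets.
    """
    if len(golds) > len(preds):
        raise ValueError("need at least as many predictions as gold entities")
    best = None
    for perm in permutations(range(len(preds)), len(golds)):
        total = sum(span_cost(preds[i], golds[j]) for j, i in enumerate(perm))
        if best is None or total < best[0]:
            best = (total, perm)
    return best

# Toy data (invented for illustration): three predictions, two gold entities.
preds = [(0, 2, "PER"), (5, 6, "ORG"), (9, 9, "LOC")]
golds = [(5, 6, "ORG"), (0, 1, "PER")]
cost, assignment = best_matching(preds, golds)
```

Reversing the order of `golds` leaves the total cost unchanged, which is the order-insensitivity the abstract refers to. Practical implementations usually replace the brute-force loop with the Hungarian algorithm, for example `scipy.optimize.linear_sum_assignment`.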

Assessment of DistilBERT performance on Named Entity Recognition task for the detection of Protected Health Information and medical concepts

10.18653/v1/2020.clinicalnlp-1.18 ◽ 2020

Author(s): Macarious Abadeer

Keyword(s): Health Information ◽ Named Entity Recognition ◽ Recognition Task ◽ Entity Recognition ◽ Protected Health Information ◽ Named Entity ◽ Medical Concepts

Uncertainty query sampling strategies for active learning of named entity recognition task

Intelligent Decision Technologies ◽ 10.3233/idt-200048 ◽ 2021 ◽ Vol 15 (1) ◽ pp. 99-114

Author(s): Ankit Agrawal ◽ Sarsij Tripathi ◽ Manu Vardhan

Keyword(s): Active Learning ◽ Learning Algorithm ◽ Named Entity Recognition ◽ Recognition Task ◽ Sampling Strategy ◽ Entity Recognition ◽ Learning Approaches ◽ Sampling Strategies ◽ Named Entity ◽ Final Probability

Active learning is a well-known approach for labeling large unannotated datasets with minimal effort and in a cost-efficient way. It iteratively selects the most informative instances and adds them to the training set, so that the performance of the learner improves with each iteration. Named entity recognition (NER) is a key task for information extraction in which entities present in sequences are labeled with the correct class. Traditional query sampling strategies for active learning consider only the final probability value of the model when selecting the most informative instances. In this paper, we propose a new active learning algorithm based on a hybrid query sampling strategy that considers sentence similarity along with the final probability value of the model, and we compare it with four other well-known pool-based uncertainty query sampling strategies for NER: least confident sampling, margin of confidence sampling, ratio of confidence sampling, and entropy query sampling. The experiments were performed on three biomedical NER datasets from different domains and a Spanish-language NER dataset. We found that all of the above approaches reach the performance of a supervised learning approach while requiring much less annotated training data. The proposed active learning algorithm performs well and, in most cases, further reduces the annotation cost compared with the other sampling strategies.
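The four baseline uncertainty strategies named in the abstract can be sketched as scoring functions over a single predicted class distribution. This is an illustrative sketch, not the paper's code: the function names, the `pool` dictionary layout, and the particular normalisations are assumptions, and other formulations of the same strategies exist. The abstract also does not say how token-level probabilities are aggregated into a sentence score, so that step is left out, as is the proposed hybrid strategy with sentence similarity.

```python
import math

def least_confidence(probs):
    """1 minus the top class probability; higher means more uncertain."""
    return 1.0 - max(probs)

def margin_of_confidence(probs):
    """1 minus the gap between the two most probable classes (needs >= 2 classes)."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)

def ratio_of_confidence(probs):
    """Second-highest probability divided by the highest (needs >= 2 classes)."""
    top, second = sorted(probs, reverse=True)[:2]
    return second / top

def entropy_score(probs):
    """Shannon entropy, normalised by the log of the number of classes."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def select_most_uncertain(pool, score_fn, k):
    """Return the k instance ids from `pool` (id -> class probabilities) with the highest score."""
    ranked = sorted(pool, key=lambda i: score_fn(pool[i]), reverse=True)
    return ranked[:k]

# Hypothetical unlabeled pool with model probabilities over three classes.
pool = {
    "a": [0.90, 0.05, 0.05],
    "b": [0.40, 0.35, 0.25],
    "c": [0.60, 0.30, 0.10],
}
to_annotate = select_most_uncertain(pool, least_confidence, k=1)
```

In a pool-based loop, the selected instances would be sent to an annotator, added to the training set, and the model retrained before the next round of scoring.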

Chemical Entity Recognition and Resolution to ChEBI

ISRN Bioinformatics ◽ 10.5402/2012/619427 ◽ 2012 ◽ Vol 2012 ◽ pp. 1-9 ◽ Cited By ~ 10

Author(s): Tiago Grego ◽ Catia Pesquita ◽ Hugo P. Bastos ◽ Francisco M. Couto

Keyword(s): Machine Learning ◽ Named Entity Recognition ◽ Recognition Task ◽ Entity Resolution ◽ Chemical Entity ◽ Biomedical Literature ◽ Entity Recognition ◽ Named Entity ◽ Lexical Similarity ◽ Recognition Systems

Chemical entities are ubiquitous throughout the biomedical literature, and text-mining systems that can efficiently identify those entities are required. Due to the lack of available corpora and data resources, the community has focused its efforts on the development of gene and protein named entity recognition systems, but with the release of ChEBI and the availability of an annotated corpus, this task can now be addressed. We developed a machine-learning-based method for chemical entity recognition and a lexical-similarity-based method for chemical entity resolution and compared them with Whatizit, a popular dictionary-based method. Our methods outperformed the dictionary-based method in all tasks, yielding an improvement in F-measure of 20% for the entity recognition task, 2–5% for the entity resolution task, and 15% for the combined entity recognition and resolution task.
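For readers unfamiliar with the F-measure reported above, the following minimal sketch computes precision, recall, and the balanced F-measure over sets of exactly matching entity annotations. The `(start, end, label)` tuples and the toy data are assumptions made for illustration; the abstract does not state which matching criterion or evaluation script the authors used.

```python
def precision_recall_f(predicted, gold):
    """Exact-match precision, recall, and balanced F-measure over entity sets."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Invented annotations: character offsets plus an entity label.
gold = [(0, 7, "CHEM"), (12, 19, "CHEM"), (25, 30, "CHEM")]
predicted = [(0, 7, "CHEM"), (12, 19, "CHEM"), (40, 44, "CHEM"), (50, 55, "CHEM")]
p, r, f = precision_recall_f(predicted, gold)
```

With two of four predictions correct and two of three gold entities found, precision is 1/2, recall is 2/3, and the F-measure is 4/7.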

Named entity recognition in texts with the help of part of speech tagging

Bulletin of Taras Shevchenko National University of Kyiv. Series: Physics and Mathematics ◽ 10.17721/1812-5409.2018/4.11 ◽ 2018 ◽ pp. 74-83

Author(s): M. Bevza

Keyword(s): State Of The Art ◽ Named Entity Recognition ◽ Recognition Task ◽ Entity Recognition ◽ Named Entity ◽ Part Of Speech Tagging ◽ Pos Tagging ◽ Part Of Speech ◽ Recent Developments ◽ Future Work

We analyze neural network architectures that yield state-of-the-art results on the named entity recognition (NER) task and propose a number of new architectures for improving results even further. We have analyzed a number of ideas and approaches that researchers have used to achieve state-of-the-art results in a variety of NLP tasks. In this work, we present a few architectures that we consider most likely to improve the existing state-of-the-art solutions for the named entity recognition and part-of-speech (POS) tagging tasks. The architectures are inspired by recent developments in multi-task learning. This work tests the hypothesis that NER and POS tagging are related tasks and that adding information about POS tags as input to the network can help achieve better NER results. Conversely, information about NER tags can help solve the task of POS tagging. This work also contains the implementation of the network and the results of the experiments, together with conclusions and future work.
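One simple way to feed POS information into an NER network is to concatenate a one-hot POS vector with each token's word embedding. The sketch below illustrates only that input construction; the tag inventory `POS_TAGS`, the helper names, and the toy embedding table are hypothetical, and the abstract does not specify how the authors actually encode POS tags.

```python
POS_TAGS = ["NOUN", "VERB", "PROPN", "ADP", "OTHER"]  # hypothetical tag inventory

def one_hot(tag):
    """One-hot vector over POS_TAGS; unseen tags fall back to OTHER."""
    idx = POS_TAGS.index(tag) if tag in POS_TAGS else POS_TAGS.index("OTHER")
    return [1.0 if i == idx else 0.0 for i in range(len(POS_TAGS))]

def build_inputs(tokens, pos_tags, embeddings, dim):
    """Concatenate each token's embedding with its one-hot POS vector.

    `embeddings` maps lowercased words to vectors of length `dim`;
    out-of-vocabulary words get a zero vector.
    """
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    zero = [0.0] * dim
    return [
        list(embeddings.get(tok.lower(), zero)) + one_hot(tag)
        for tok, tag in zip(tokens, pos_tags)
    ]

# Toy two-dimensional embedding table, invented for illustration.
embeddings = {"kyiv": [0.1, 0.2]}
features = build_inputs(["Kyiv", "is"], ["PROPN", "AUX"], embeddings, dim=2)
```

Each resulting vector has `dim + len(POS_TAGS)` entries and could serve as the per-token input to a sequence model; a learned POS embedding is a common alternative to the one-hot encoding.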
