Document-Level Named Entity Recognition with Q-Network

Author(s):  
Tingming Lu ◽  
Yaocheng Gui ◽  
Zhiqiang Gao

2020 ◽  
Vol 34 (05) ◽  
pp. 8441-8448
Author(s):  
Ying Luo ◽  
Fengshun Xiao ◽  
Hai Zhao

Named entity recognition (NER) models are typically based on the bidirectional LSTM (BiLSTM) architecture. Its sequential nature and single-sentence input prevent the full utilization of global information from a larger scope, not only within the entire sentence but also across the entire document (dataset). In this paper, we address these two deficiencies and propose a model augmented with a hierarchical contextualized representation: a sentence-level representation and a document-level representation. At the sentence level, we take the different contributions of words in a single sentence into consideration to enhance the sentence representation learned from an independent BiLSTM via a label embedding attention mechanism. At the document level, a key-value memory network is adopted to record document-aware information for each unique word, which is sensitive to the similarity of context information. The two-level hierarchical contextualized representations are fused with each input token embedding and the corresponding BiLSTM hidden state, respectively. Experimental results on three benchmark NER datasets (the CoNLL-2003 and OntoNotes 5.0 English datasets and the CoNLL-2002 Spanish dataset) show that we establish new state-of-the-art results.
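The document-level component can be sketched roughly as follows. This is a minimal, hypothetical PyTorch illustration (class and method names such as `KeyValueMemory`, `write`, and `read` are ours, not the authors'): each unique word stores the BiLSTM hidden states of its occurrences in the document, and the current occurrence retrieves a document-aware vector by attending over those stored states.

```python
# Minimal sketch (not the authors' code) of a document-level key-value memory
# for NER, assuming PyTorch. Per unique token, the memory holds the BiLSTM
# hidden states of its earlier occurrences; the current occurrence queries the
# memory and fuses the readout with its own hidden state.
import torch
import torch.nn.functional as F
from collections import defaultdict


class KeyValueMemory:
    def __init__(self, hidden_dim):
        self.hidden_dim = hidden_dim
        # token string -> list of hidden-state tensors from prior occurrences
        self.slots = defaultdict(list)

    def write(self, token, hidden_state):
        # Record the contextual representation of this occurrence.
        self.slots[token].append(hidden_state.detach())

    def read(self, token, query):
        # Attend over stored occurrences of the same token; fall back to the
        # query itself if the token has not been seen before.
        if not self.slots[token]:
            return query
        keys = torch.stack(self.slots[token])            # (n_occ, hidden_dim)
        scores = keys @ query / self.hidden_dim ** 0.5   # scaled dot product
        weights = F.softmax(scores, dim=0)               # (n_occ,)
        doc_repr = weights @ keys                        # document-aware vector
        # Simple fusion choice for this sketch: average readout and query.
        return 0.5 * (doc_repr + query)
```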


Author(s):  
Tao Gui ◽  
Jiacheng Ye ◽  
Qi Zhang ◽  
Yaqian Zhou ◽  
Yeyun Gong ◽  
...  

Document-level label consistency is an effective indicator that different occurrences of a particular token sequence are very likely to have the same entity type. Previous work focused on better context representations and used a CRF for label decoding. However, CRF-based methods are inadequate for modeling document-level label consistency. This work introduces a novel two-stage label refinement approach to handle document-level label consistency: a key-value memory network first records draft labels predicted by the base model, and a multi-channel Transformer then refines these draft predictions based on the explicit co-occurrence relationships derived from the memory network. In addition, to mitigate the side effects of incorrect draft labels, Bayesian neural networks are used to flag labels with a high probability of being wrong, which greatly helps prevent the incorrect refinement of correct draft labels. Experimental results on three named entity recognition benchmarks demonstrate that the proposed method significantly outperforms state-of-the-art methods.
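The uncertainty signal that guards the refinement stage can be approximated with Monte Carlo dropout, a standard Bayesian approximation. The sketch below is illustrative only (it assumes PyTorch and a hypothetical `model` that maps token ids to per-token logits): it computes predictive entropy under dropout so that only high-uncertainty draft labels are exposed to refinement.

```python
# Minimal sketch (an assumption, not the paper's implementation) of using
# Monte Carlo dropout to flag draft labels that are likely wrong.
import torch


def mc_dropout_uncertainty(model, tokens, n_samples=8):
    """Return draft labels and per-token predictive entropy under dropout.

    `model(tokens)` is assumed to return logits of shape (seq_len, n_labels).
    """
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(tokens), dim=-1) for _ in range(n_samples)]
        ).mean(dim=0)                                    # (seq_len, n_labels)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    return probs.argmax(dim=-1), entropy


# Usage sketch: refine only positions whose uncertainty is unusually high,
# leaving confident draft labels untouched.
# draft, u = mc_dropout_uncertainty(base_model, token_ids)
# to_refine = u > u.mean() + u.std()
```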


2020 ◽  
Vol 34 (05) ◽  
pp. 7961-7968
Author(s):  
Anwen Hu ◽  
Zhicheng Dou ◽  
Jian-Yun Nie ◽  
Ji-Rong Wen

Most state-of-the-art named entity recognition systems are designed to process each sentence within a document independently. Such systems easily confuse entity types when the context information in a sentence is insufficient. To utilize the context information within the whole document, most document-level work lets neural networks learn relations across sentences on their own, which is not intuitive for humans. In this paper, we divide entities into multi-token entities that contain multiple tokens and single-token entities that are composed of a single token. We propose that the context information of multi-token entities should be more reliable in document-level NER for news articles. We design a fusion attention mechanism which not only learns the semantic relevance between occurrences of the same token, but also focuses more on occurrences belonging to multi-token entities. To identify multi-token entities, we design an auxiliary task, namely ‘Multi-token Entity Classification’, and perform this task simultaneously with document-level NER. This auxiliary task is simplified from NER and does not require extra annotation. Experimental results on the CoNLL-2003 dataset and the OntoNotes dataset show that our model outperforms state-of-the-art sentence-level and document-level NER methods.
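Because the auxiliary task is derived directly from the existing NER annotation, its labels and a joint objective can be sketched as follows. This is a hypothetical PyTorch fragment with our own helper names, assuming BIO-style tags: tokens inside entities spanning more than one token receive the positive auxiliary label, and the auxiliary loss is added to the main NER loss.

```python
# Minimal sketch (not the authors' code) of deriving 'Multi-token Entity
# Classification' labels from BIO-style NER tags and combining the two losses;
# no extra annotation is needed for the auxiliary task.
import torch
import torch.nn.functional as F


def multi_token_labels(bio_tags):
    """1 if a token belongs to an entity that spans more than one token."""
    labels = [0] * len(bio_tags)
    i = 0
    while i < len(bio_tags):
        if bio_tags[i].startswith("B-"):
            j = i + 1
            while j < len(bio_tags) and bio_tags[j].startswith("I-"):
                j += 1
            if j - i > 1:                      # entity covers multiple tokens
                for k in range(i, j):
                    labels[k] = 1
            i = j
        else:
            i += 1
    return torch.tensor(labels)


def joint_loss(ner_logits, ner_gold, aux_logits, bio_tags, aux_weight=0.5):
    # Main NER loss plus the simplified auxiliary classification loss.
    ner_loss = F.cross_entropy(ner_logits, ner_gold)
    aux_loss = F.cross_entropy(aux_logits, multi_token_labels(bio_tags))
    return ner_loss + aux_weight * aux_loss
```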


2021 ◽  
Vol 336 ◽  
pp. 06021
Author(s):  
Hongshuai Liu ◽  
Ge Jun ◽  
Yuanyuan Zheng

Nowadays, most deep learning models ignore the characteristics of the Chinese language and global information when processing Chinese tasks. To address this problem, we construct a BERT-BiLSTM-Attention-CRF model. In the model, we embed the BERT pre-trained language model that adopts the Whole Word Mask strategy, and we add a document-level attention mechanism. Experimental results show that our method achieves good results on the MSRA corpus, with an F1 of 95.00%.
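A rough sketch of such a tagger is given below. It is an assumption-laden illustration rather than the authors' implementation: it relies on the `transformers` and `pytorch-crf` packages, uses a whole-word-mask Chinese BERT checkpoint such as `hfl/chinese-bert-wwm`, and substitutes a generic self-attention layer for the paper's document-level attention.

```python
# Minimal sketch (assumptions, not the paper's code) of a
# BERT-BiLSTM-Attention-CRF sequence tagger.
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF


class BertBiLstmAttnCrf(nn.Module):
    def __init__(self, n_labels, lstm_dim=256):
        super().__init__()
        self.bert = BertModel.from_pretrained("hfl/chinese-bert-wwm")
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_dim,
                            batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * lstm_dim, num_heads=4,
                                          batch_first=True)
        self.proj = nn.Linear(2 * lstm_dim, n_labels)
        self.crf = CRF(n_labels, batch_first=True)

    def forward(self, input_ids, attention_mask, labels=None):
        # Contextual subword embeddings from whole-word-mask BERT.
        x = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        x, _ = self.lstm(x)
        # Self-attention lets each position look beyond its local LSTM window.
        x, _ = self.attn(x, x, x, key_padding_mask=~attention_mask.bool())
        emissions = self.proj(x)
        if labels is not None:
            # Negative log-likelihood of the gold tag sequence under the CRF.
            return -self.crf(emissions, labels, mask=attention_mask.bool())
        return self.crf.decode(emissions, mask=attention_mask.bool())
```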


2017 ◽  
Vol 34 (8) ◽  
pp. 1381-1388 ◽  
Author(s):  
Ling Luo ◽  
Zhihao Yang ◽  
Pei Yang ◽  
Yin Zhang ◽  
Lei Wang ◽  
...  
