Document-Level Named Entity Recognition with Q-Network

Author(s):  
Tingming Lu ◽  
Yaocheng Gui ◽  
Zhiqiang Gao

2020 ◽  
Vol 34 (05) ◽  
pp. 8441-8448
Author(s):  
Ying Luo ◽  
Fengshun Xiao ◽  
Hai Zhao

Named entity recognition (NER) models are typically based on the bidirectional LSTM (BiLSTM) architecture. Its sequential nature and single-sentence input prevent the full utilization of global information from a larger scope, not only within the entire sentence but also across the entire document (dataset). In this paper, we address these two deficiencies and propose a model augmented with a hierarchical contextualized representation: a sentence-level representation and a document-level representation. At the sentence level, we take the different contributions of words in a single sentence into consideration to enhance the sentence representation learned from an independent BiLSTM via a label embedding attention mechanism. At the document level, a key-value memory network is adopted to record document-aware information for each unique word, which is sensitive to the similarity of context information. The two-level hierarchical contextualized representations are fused with each input token embedding and the corresponding BiLSTM hidden state, respectively. Experimental results on three benchmark NER datasets (the CoNLL-2003 and OntoNotes 5.0 English datasets and the CoNLL-2002 Spanish dataset) show that we establish new state-of-the-art results.
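The document-level component can be sketched roughly as follows. This is a minimal, hypothetical PyTorch illustration (class and method names such as `KeyValueMemory`, `write`, and `read` are ours, not the authors'): each unique word stores the BiLSTM hidden states of its occurrences in the document, and the current occurrence retrieves a document-aware vector by attending over those stored states.

```python
# Minimal sketch (not the authors' code) of a document-level key-value memory
# for NER, assuming PyTorch. Per unique token, the memory holds the BiLSTM
# hidden states of its earlier occurrences; the current occurrence queries the
# memory and fuses the readout with its own hidden state.
import torch
import torch.nn.functional as F
from collections import defaultdict


class KeyValueMemory:
    def __init__(self, hidden_dim):
        self.hidden_dim = hidden_dim
        # token string -> list of hidden-state tensors from prior occurrences
        self.slots = defaultdict(list)

    def write(self, token, hidden_state):
        # Record the contextual representation of this occurrence.
        self.slots[token].append(hidden_state.detach())

    def read(self, token, query):
        # Attend over stored occurrences of the same token; fall back to the
        # query itself if the token has not been seen before.
        if not self.slots[token]:
            return query
        keys = torch.stack(self.slots[token])            # (n_occ, hidden_dim)
        scores = keys @ query / self.hidden_dim ** 0.5   # scaled dot product
        weights = F.softmax(scores, dim=0)               # (n_occ,)
        doc_repr = weights @ keys                        # document-aware vector
        # Simple fusion choice for this sketch: average readout and query.
        return 0.5 * (doc_repr + query)
```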


Author(s):  
Tao Gui ◽  
Jiacheng Ye ◽  
Qi Zhang ◽  
Yaqian Zhou ◽  
Yeyun Gong ◽  
...  

Document-level label consistency is an effective indicator that different occurrences of a particular token sequence are very likely to have the same entity type. Previous work focused on better context representations and used a CRF for label decoding. However, CRF-based methods are inadequate for modeling document-level label consistency. This work introduces a novel two-stage label refinement approach to handle document-level label consistency: a key-value memory network first records draft labels predicted by the base model, and a multi-channel Transformer then refines these draft predictions based on the explicit co-occurrence relationships derived from the memory network. In addition, to mitigate the side effects of incorrect draft labels, Bayesian neural networks are used to flag labels with a high probability of being wrong, which greatly helps prevent the incorrect refinement of correct draft labels. Experimental results on three named entity recognition benchmarks demonstrate that the proposed method significantly outperforms state-of-the-art methods.
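The uncertainty signal that guards the refinement stage can be approximated with Monte Carlo dropout, a standard Bayesian approximation. The sketch below is illustrative only (it assumes PyTorch and a hypothetical `model` that maps token ids to per-token logits): it computes predictive entropy under dropout so that only high-uncertainty draft labels are exposed to refinement.

```python
# Minimal sketch (an assumption, not the paper's implementation) of using
# Monte Carlo dropout to flag draft labels that are likely wrong.
import torch


def mc_dropout_uncertainty(model, tokens, n_samples=8):
    """Return draft labels and per-token predictive entropy under dropout.

    `model(tokens)` is assumed to return logits of shape (seq_len, n_labels).
    """
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(tokens), dim=-1) for _ in range(n_samples)]
        ).mean(dim=0)                                    # (seq_len, n_labels)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    return probs.argmax(dim=-1), entropy


# Usage sketch: refine only positions whose uncertainty is unusually high,
# leaving confident draft labels untouched.
# draft, u = mc_dropout_uncertainty(base_model, token_ids)
# to_refine = u > u.mean() + u.std()
```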


2020 ◽  
Vol 34 (05) ◽  
pp. 7961-7968
Author(s):  
Anwen Hu ◽  
Zhicheng Dou ◽  
Jian-Yun Nie ◽  
Ji-Rong Wen

Most state-of-the-art named entity recognition systems are designed to process each sentence within a document independently. Such systems easily confuse entity types when the context information in a sentence is insufficient. To utilize the context information within the whole document, most document-level work lets neural networks learn relations across sentences on their own, which is not intuitive for humans. In this paper, we divide entities into multi-token entities that contain multiple tokens and single-token entities that are composed of a single token. We propose that the context information of multi-token entities should be more reliable in document-level NER for news articles. We design a fusion attention mechanism which not only learns the semantic relevance between occurrences of the same token, but also focuses more on occurrences belonging to multi-token entities. To identify multi-token entities, we design an auxiliary task, namely ‘Multi-token Entity Classification’, and perform this task simultaneously with document-level NER. This auxiliary task is simplified from NER and does not require extra annotation. Experimental results on the CoNLL-2003 dataset and the OntoNotes dataset show that our model outperforms state-of-the-art sentence-level and document-level NER methods.
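Because the auxiliary task is derived directly from the existing NER annotation, its labels and a joint objective can be sketched as follows. This is a hypothetical PyTorch fragment with our own helper names, assuming BIO-style tags: tokens inside entities spanning more than one token receive the positive auxiliary label, and the auxiliary loss is added to the main NER loss.

```python
# Minimal sketch (not the authors' code) of deriving 'Multi-token Entity
# Classification' labels from BIO-style NER tags and combining the two losses;
# no extra annotation is needed for the auxiliary task.
import torch
import torch.nn.functional as F


def multi_token_labels(bio_tags):
    """1 if a token belongs to an entity that spans more than one token."""
    labels = [0] * len(bio_tags)
    i = 0
    while i < len(bio_tags):
        if bio_tags[i].startswith("B-"):
            j = i + 1
            while j < len(bio_tags) and bio_tags[j].startswith("I-"):
                j += 1
            if j - i > 1:                      # entity covers multiple tokens
                for k in range(i, j):
                    labels[k] = 1
            i = j
        else:
            i += 1
    return torch.tensor(labels)


def joint_loss(ner_logits, ner_gold, aux_logits, bio_tags, aux_weight=0.5):
    # Main NER loss plus the simplified auxiliary classification loss.
    ner_loss = F.cross_entropy(ner_logits, ner_gold)
    aux_loss = F.cross_entropy(aux_logits, multi_token_labels(bio_tags))
    return ner_loss + aux_weight * aux_loss
```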


2021 ◽  
Vol 336 ◽  
pp. 06021
Author(s):  
Hongshuai Liu ◽  
Ge Jun ◽  
Yuanyuan Zheng

Nowadays, most deep learning models ignore the characteristics of the Chinese language and global information when processing Chinese tasks. To address this problem, we construct a BERT-BiLSTM-Attention-CRF model. In the model, we embed the BERT pre-trained language model that adopts the Whole Word Mask strategy, and we add a document-level attention mechanism. Experimental results show that our method achieves good results on the MSRA corpus, with an F1 of 95.00%.
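A rough sketch of such a tagger is given below. It is an assumption-laden illustration rather than the authors' implementation: it relies on the `transformers` and `pytorch-crf` packages, uses a whole-word-mask Chinese BERT checkpoint such as `hfl/chinese-bert-wwm`, and substitutes a generic self-attention layer for the paper's document-level attention.

```python
# Minimal sketch (assumptions, not the paper's code) of a
# BERT-BiLSTM-Attention-CRF sequence tagger.
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF


class BertBiLstmAttnCrf(nn.Module):
    def __init__(self, n_labels, lstm_dim=256):
        super().__init__()
        self.bert = BertModel.from_pretrained("hfl/chinese-bert-wwm")
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_dim,
                            batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * lstm_dim, num_heads=4,
                                          batch_first=True)
        self.proj = nn.Linear(2 * lstm_dim, n_labels)
        self.crf = CRF(n_labels, batch_first=True)

    def forward(self, input_ids, attention_mask, labels=None):
        # Contextual subword embeddings from whole-word-mask BERT.
        x = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        x, _ = self.lstm(x)
        # Self-attention lets each position look beyond its local LSTM window.
        x, _ = self.attn(x, x, x, key_padding_mask=~attention_mask.bool())
        emissions = self.proj(x)
        if labels is not None:
            # Negative log-likelihood of the gold tag sequence under the CRF.
            return -self.crf(emissions, labels, mask=attention_mask.bool())
        return self.crf.decode(emissions, mask=attention_mask.bool())
```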


2017 ◽  
Vol 34 (8) ◽  
pp. 1381-1388 ◽  
Author(s):  
Ling Luo ◽  
Zhihao Yang ◽  
Pei Yang ◽  
Yin Zhang ◽  
Lei Wang ◽  
...  
