Zero-Resource Cross-Lingual Named Entity Recognition

Recently, neural methods have achieved state-of-the-art (SOTA) results in Named Entity Recognition (NER) tasks for many languages without the need for manually crafted features. However, these models still require manually annotated training data, which is not available for many languages. In this paper, we propose an unsupervised cross-lingual NER model that can transfer NER knowledge from one language to another in a completely unsupervised way without relying on any bilingual dictionary or parallel data. Our model achieves this through word-level adversarial learning and augmented fine-tuning with parameter sharing and feature augmentation. Experiments on five different languages demonstrate the effectiveness of our approach, outperforming existing models by a good margin and setting a new SOTA for each language pair.

Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00374 ◽

2021 ◽

Vol 9 ◽

pp. 410-428

Author(s):

Edoardo M. Ponti ◽

Ivan Vulić ◽

Ryan Cotterell ◽

Marinela Parovic ◽

Roi Reichart ◽

...

Keyword(s):

Latent Variables ◽

Named Entity Recognition ◽

Training Data ◽

Entity Recognition ◽

Language Varieties ◽

Named Entity ◽

Shot Classification ◽

Pos Tagging ◽

Part Of Speech ◽

Cross Lingual

Abstract: Most combinations of NLP tasks and language varieties lack in-domain examples for supervised training because of the paucity of annotated data. How can neural models make sample-efficient generalizations from task–language combinations with available data to low-resource ones? In this work, we propose a Bayesian generative model for the space of neural parameters. We assume that this space can be factorized into latent variables for each language and each task. We infer the posteriors over such latent variables based on data from seen task–language combinations through variational inference. This enables zero-shot classification on unseen combinations at prediction time. For instance, given training data for named entity recognition (NER) in Vietnamese and for part-of-speech (POS) tagging in Wolof, our model can perform accurate predictions for NER in Wolof. In particular, we experiment with a typologically diverse sample of 33 languages from 4 continents and 11 families, and show that our model yields results comparable to or better than state-of-the-art zero-shot cross-lingual transfer methods. Our code is available at github.com/cambridgeltl/parameter-factorization.
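The factorization idea in this abstract can be illustrated with a deliberately small sketch. It is not the authors' model: fixed point estimates stand in for the posteriors inferred by variational inference, and the latent vectors, the `GENERATOR` matrix, and the function `generate_parameters` are all hypothetical names and values chosen for illustration. The sketch only shows why an unseen task–language pair can still receive parameters once its task and its language have each been observed separately.

```python
# Toy illustration only: point estimates stand in for the posteriors that the
# abstract says are inferred with variational inference.

# Hypothetical latent vectors, one per task and one per language.
TASK_LATENTS = {"NER": [0.5, -1.0], "POS": [1.5, 0.25]}
LANG_LATENTS = {"vi": [1.0, 0.0], "wo": [0.0, 2.0]}

# Hypothetical generator matrix mapping [task latent; language latent]
# to the weights of a tiny classifier. In practice it would be learned.
GENERATOR = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]


def generate_parameters(task, lang):
    """Build classifier weights from a task latent and a language latent."""
    z = TASK_LATENTS[task] + LANG_LATENTS[lang]  # list concatenation
    return [sum(w * x for w, x in zip(row, z)) for row in GENERATOR]


# Task-language combinations that have training data in the abstract's example.
SEEN = {("NER", "vi"), ("POS", "wo")}

# NER in Wolof was never observed as a pair, but both of its latents were.
zero_shot = ("NER", "wo")
assert zero_shot not in SEEN
zero_shot_weights = generate_parameters(*zero_shot)
print(zero_shot_weights)
```

The design point is that the number of latent variables grows with the number of tasks plus the number of languages, while the number of combinations they cover grows with their product.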

A Multichannel Biomedical Named Entity Recognition Model Based on Multitask Learning and Contextualized Word Representations

Wireless Communications and Mobile Computing ◽

10.1155/2020/8894760 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13 ◽

Cited By ~ 1

Author(s):

Hao Wei ◽

Mingyuan Gao ◽

Ai Zhou ◽

Fei Chen ◽

Wen Qu ◽

...

Keyword(s):

Conditional Random Field ◽

Named Entity Recognition ◽

Multitask Learning ◽

Biomedical Literature ◽

Training Data ◽

Entity Recognition ◽

Language Models ◽

Named Entity ◽

Word Level ◽

Biomedical Named Entity Recognition

As the biomedical literature grows exponentially, biomedical named entity recognition (BNER) has become an important task in biomedical information extraction. In previous deep learning studies, pretrained word embeddings became an indispensable part of neural network models, effectively improving their performance. However, the biomedical literature typically contains numerous polysemous and ambiguous words, so fixed pretrained word representations are not appropriate. Therefore, this paper adopts pretrained embeddings from language models (ELMo) to generate dynamic word embeddings according to context. In addition, to mitigate insufficient training data in specific fields and to introduce richer input representations, we propose a multitask learning multichannel bidirectional gated recurrent unit (BiGRU) model. Multiple feature representations (e.g., word-level, contextualized word-level, character-level) are fed, separately or jointly, into the different channels. Because the BiGRU captures features automatically, manual intervention and feature engineering can be avoided. In the merge layer, several methods are designed to integrate the outputs of the multichannel BiGRU. We combine the BiGRU with a conditional random field (CRF) to model label dependencies in sequence labeling. Moreover, within the multitask learning framework, we introduce auxiliary corpora with the same entity types as the main corpora to be evaluated, then train our model on these separate corpora with shared parameters. Our model obtains promising results on the JNLPBA and NCBI-disease corpora, with F1-scores of 76.0% and 88.7%, respectively. The latter is the best performance among reported feature-based models.
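To make the role of the CRF layer concrete, the following is a minimal sketch of Viterbi decoding over per-token label scores plus label-to-label transition scores. It is a generic illustration, not the paper's implementation: the emission scores would come from the BiGRU, and here they are invented numbers; the function name `viterbi` and the three-label tag set are assumptions.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions:   list of {label: score}, one dict per token
    transitions: {(previous_label, current_label): score}
    """
    # best[label] = (score of the best path ending in label, that path)
    best = {lab: (emissions[0][lab], [lab]) for lab in labels}
    for scores in emissions[1:]:
        new_best = {}
        for cur in labels:
            # Pick the predecessor that maximizes path score plus transition.
            prev = max(labels, key=lambda p: best[p][0] + transitions[(p, cur)])
            total = best[prev][0] + transitions[(prev, cur)] + scores[cur]
            new_best[cur] = (total, best[prev][1] + [cur])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


LABELS = ["O", "B", "I"]

# All transitions are neutral except O -> I, which is invalid in BIO tagging.
TRANSITIONS = {(p, c): 0.0 for p in LABELS for c in LABELS}
TRANSITIONS[("O", "I")] = -10.0

# Invented emission scores for a two-token sentence.
EMISSIONS = [
    {"O": 1.0, "B": 0.8, "I": 0.0},
    {"O": 0.0, "B": 0.0, "I": 1.0},
]

# Choosing the best label per token independently gives the invalid O, I.
greedy = [max(LABELS, key=lambda lab: tok[lab]) for tok in EMISSIONS]
decoded = viterbi(EMISSIONS, TRANSITIONS, LABELS)
print(greedy, decoded)
```

Scoring whole sequences rather than single tokens is what lets the transition penalty override a locally attractive but inconsistent label.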

Named Entity Recognition in Chinese Medical Literature Using Pretraining Models

Scientific Programming ◽

10.1155/2020/8812754 ◽

2020 ◽

Vol 2020 ◽

pp. 1-9

Author(s):

Yu Wang ◽

Yining Sun ◽

Zuchang Ma ◽

Lisheng Gao ◽

Yang Xu

Keyword(s):

Large Scale ◽

Data Augmentation ◽

Medical Literature ◽

Named Entity Recognition ◽

Semantic Knowledge ◽

Training Data ◽

Fine Tuning ◽

Entity Recognition ◽

Small Scale ◽

Named Entity

The medical literature contains valuable knowledge, such as the clinical symptoms, diagnosis, and treatments of a particular disease. Named Entity Recognition (NER) is the initial step in extracting this knowledge from unstructured text and presenting it as a Knowledge Graph (KG). However, previous NER approaches have often suffered from small-scale human-labelled training data. Furthermore, extracting knowledge from Chinese medical literature is a more complex task because there is no segmentation between Chinese characters. Recently, pretraining models, which obtain representations with prior semantic knowledge from large-scale unlabelled corpora, have achieved state-of-the-art results on a wide variety of Natural Language Processing (NLP) tasks. However, the capabilities of pretraining models have not been fully exploited, and applications of pretraining models other than BERT in specific domains, such as NER in Chinese medical literature, are also of interest. In this paper, we enhance the performance of NER in Chinese medical literature using pretraining models. First, we propose a data augmentation method that replaces words in the training set with synonyms through the Masked Language Model (MLM), which is a pretraining task. Then, we treat NER as the downstream task of the pretraining model and transfer the prior semantic knowledge obtained during pretraining to it. Finally, we conduct experiments to compare the performance of six pretraining models (BERT, BERT-WWM, BERT-WWM-EXT, ERNIE, ERNIE-tiny, and RoBERTa) in recognizing named entities in Chinese medical literature. The effects of feature extraction and fine-tuning, as well as different downstream model structures, are also explored. Experimental results demonstrate that the proposed data augmentation method yields meaningful improvements in recognition performance. In addition, RoBERTa-CRF achieves the highest F1-score compared with the previous methods and other pretraining models.

Learning Task-Specific Representation for Novel Words in Sequence Labeling

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/715 ◽

2019 ◽

Author(s):

Minlong Peng ◽

Qi Zhang ◽

Xiaoyu Xing ◽

Tao Gui ◽

Jinlan Fu ◽

...

Keyword(s):

Empirical Studies ◽

Named Entity Recognition ◽

Learning Task ◽

Training Data ◽

Entity Recognition ◽

Named Entity ◽

Part Of Speech Tagging ◽

Sequence Labeling ◽

Part Of Speech ◽

Word Representation

Word representation is a key component in neural-network-based sequence labeling systems. However, representations of unseen or rare words trained on the end task are usually too poor to yield appreciable performance. This is commonly referred to as the out-of-vocabulary (OOV) problem. In this work, we address the OOV problem in sequence labeling using only the training data of the task. To this end, we propose a novel method to predict representations for OOV words from their surface forms (e.g., character sequences) and contexts. The method is specifically designed to avoid the error propagation problem suffered by existing approaches in the same paradigm. To evaluate its effectiveness, we performed extensive empirical studies on four part-of-speech (POS) tagging tasks and four named entity recognition (NER) tasks. Experimental results show that the proposed method achieves better or competitive performance on the OOV problem compared with existing state-of-the-art methods.

BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition

10.21203/rs.3.rs-90025/v1 ◽

2020 ◽

Author(s):

Usman Naseem ◽

Matloob Khushi ◽

Vinay Reddy ◽

Sakthivel Rajendran ◽

Imran Razzak ◽

...

Keyword(s):

State Of The Art ◽

Language Model ◽

Named Entity Recognition ◽

Training Data ◽

Entity Recognition ◽

Future Research ◽

Named Entity ◽

Domain Specific ◽

Context Dependent ◽

Biomedical Named Entity Recognition

Abstract: Background: In recent years, with the growing amount of biomedical documents, coupled with advancement in natural language processing algorithms, the research on biomedical named entity recognition (BioNER) has increased exponentially. However, BioNER research is challenging because NER in the biomedical domain (i) is often restricted by the limited amount of training data, (ii) must handle entities that can refer to multiple types and concepts depending on context, and (iii) relies heavily on acronyms that are sub-domain specific. Existing BioNER approaches often neglect these issues and directly adopt state-of-the-art (SOTA) models trained on general corpora, which often yields unsatisfactory results. Results: We propose biomedical ALBERT (A Lite Bidirectional Encoder Representations from Transformers for Biomedical Text Mining), or BioALBERT, an effective domain-specific pre-trained language model trained on a huge biomedical corpus and designed to capture biomedical context-dependent NER. We adopted the self-supervised loss function used in ALBERT, which targets modelling inter-sentence coherence, to better learn context-dependent representations, and incorporated parameter reduction strategies to minimise memory usage and improve training time in BioNER. In our experiments, BioALBERT outperformed comparative SOTA BioNER models on eight biomedical NER benchmark datasets with four different entity types. Performance improves on (i) disease type corpora by 7.47% (NCBI-disease) and 10.63% (BC5CDR-disease); (ii) drug-chem type corpora by 4.61% (BC5CDR-Chem) and 3.89% (BC4CHEMD); (iii) gene-protein type corpora by 12.25% (BC2GM) and 6.42% (JNLPBA); and (iv) species type corpora by 6.19% (LINNAEUS) and 23.71% (Species-800), leading to state-of-the-art results. Conclusions: The performance of the proposed model on four different biomedical entity types shows that our model is robust and generalizable in recognizing biomedical entities in text. We trained four different variants of BioALBERT models, which are available to the research community for use in future research.

End-to-End Recurrent Neural Network Models for Vietnamese Named Entity Recognition: Word-Level Vs. Character-Level

Communications in Computer and Information Science - Computational Linguistics ◽

10.1007/978-981-10-8438-6_18 ◽

2018 ◽

pp. 219-232 ◽

Cited By ~ 5

Author(s):

Thai-Hoang Pham ◽

Phuong Le-Hong

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Named Entity Recognition ◽

Network Models ◽

Entity Recognition ◽

Neural Network Models ◽

Named Entity ◽

Word Level ◽

End To End

A Federated Adversarial Learning Method for Biomedical Named Entity Recognition

10.1109/bibm52615.2021.9669728 ◽

2021 ◽

Author(s):

Hanyu Zhao ◽

Sha Yuan ◽

Niantao Xie ◽

Jiahong Leng ◽

Guoqiang Wang

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Learning Method ◽

Adversarial Learning ◽

Named Entity ◽

Biomedical Named Entity Recognition

Enhanced Meta-Learning for Cross-Lingual Named Entity Recognition with Minimal Resources

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6466 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9274-9281

Author(s):

Qianhui Wu ◽

Zijia Lin ◽

Guoxin Wang ◽

Hui Chen ◽

Börje F. Karlsson ◽

...

Keyword(s):

Learning Algorithm ◽

Named Entity Recognition ◽

Entity Recognition ◽

Target Language ◽

Test Case ◽

Named Entity ◽

Meta Learning ◽

Target Languages ◽

Cross Lingual

For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER). While all existing methods directly transfer a source-learned model to a target language, in this paper we propose to fine-tune the learned model on a few examples similar to a given test case, which could benefit the prediction by leveraging the structural and semantic information conveyed in such similar examples. To this end, we present a meta-learning algorithm to find a good model parameter initialization that can adapt quickly to the given test case, and we propose to construct multiple pseudo-NER tasks for meta-training by computing sentence similarities. To further improve the model's generalization ability across different languages, we introduce a masking scheme and augment the loss function with an additional maximum term during meta-training. We conduct extensive experiments on cross-lingual named entity recognition with minimal resources over five target languages. The results show that our approach significantly outperforms existing state-of-the-art methods across the board.
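The retrieval step mentioned in this abstract, finding a few examples similar to a test case, can be sketched as a top-k search by cosine similarity over sentence vectors. This is an assumption-laden illustration: the abstract does not specify the similarity measure or the sentence encoder, so cosine similarity, the function names `cosine` and `top_k_similar`, and the two-dimensional vectors below are all hypothetical stand-ins.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_similar(query_vec, candidates, k):
    """Return the ids of the k candidates most similar to the query.

    candidates: list of (sentence_id, vector) pairs
    """
    ranked = sorted(
        candidates,
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [sentence_id for sentence_id, _ in ranked[:k]]


# Invented sentence vectors; a real system would use an encoder's output.
SOURCE_SENTENCES = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
TEST_CASE = [1.0, 0.1]

support_set = top_k_similar(TEST_CASE, SOURCE_SENTENCES, k=2)
print(support_set)
```

The retrieved sentences would then serve as the small set on which the model is fine-tuned before predicting on the test case.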

Chinese Clinical Named Entity Recognition with Word-Level Information Incorporating Dictionaries

2019 International Joint Conference on Neural Networks (IJCNN) ◽

10.1109/ijcnn.2019.8852113 ◽

2019 ◽

Cited By ~ 2

Author(s):

Ningjie Lu ◽

Jun Zheng ◽

Wen Wu ◽

Yan Yang ◽

Kaiwei Chen ◽

...

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity ◽

Word Level ◽

Level Information

Finding next of kin: Cross-lingual embedding spaces for related languages

Natural Language Engineering ◽

10.1017/s1351324919000354 ◽

2019 ◽

Vol 26 (2) ◽

pp. 163-182 ◽

Cited By ~ 1

Author(s):

Serge Sharoff

Keyword(s):

Similarity Measure ◽

Named Entity Recognition ◽

Entity Recognition ◽

Levenshtein Distance ◽

Next Of Kin ◽

Named Entity ◽

Genre Classification ◽

Lexical Similarity ◽

Cross Lingual ◽

Embedding Methods

Abstract: Some languages have very few NLP resources, while many of them are closely related to better-resourced languages. This paper explores how the similarity between the languages can be utilised by porting resources from better- to lesser-resourced languages. The paper introduces a way of building a representation shared across related languages by combining cross-lingual embedding methods with a lexical similarity measure which is based on the weighted Levenshtein distance. One of the outcomes of the experiments is a Panslavonic embedding space for nine Balto-Slavonic languages. The paper demonstrates that the resulting embedding space helps in such applications as morphological prediction, named-entity recognition and genre classification.
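A weighted Levenshtein distance, the basis of the lexical similarity measure named in this abstract, can be sketched with the standard dynamic programme in which substitution cost depends on the character pair. The weights below are hypothetical: the abstract does not state which character pairs are treated as cheap, so `CHEAP_PAIRS` and the 0.25 cost are assumptions for illustration only.

```python
def weighted_levenshtein(a, b, sub_cost, indel_cost=1.0):
    """Edit distance where substitution cost depends on the character pair."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,            # delete ca
                curr[j - 1] + indel_cost,        # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # substitute or match
            ))
        prev = curr
    return prev[-1]


# Hypothetical set of character pairs that are cheap to substitute.
CHEAP_PAIRS = {frozenset("oa"), frozenset("ie")}


def example_sub_cost(x, y):
    """Zero for identical characters, reduced cost for the cheap pairs."""
    if x == y:
        return 0.0
    return 0.25 if frozenset((x, y)) in CHEAP_PAIRS else 1.0


def unit_sub_cost(x, y):
    """Plain Levenshtein substitution cost."""
    return 0.0 if x == y else 1.0


# Transliterated cognates that differ only in o/a vowels.
print(weighted_levenshtein("moloko", "malako", example_sub_cost))
print(weighted_levenshtein("moloko", "malako", unit_sub_cost))
```

With the reduced vowel cost the pair scores 0.5 instead of the unweighted 2.0, which is the kind of effect that lets a weighted measure treat regular sound correspondences between related languages as near matches.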
