UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data

Author(s):  
Qianhui Wu ◽  
Zijia Lin ◽  
Börje F. Karlsson ◽  
Biqing Huang ◽  
Jian-Guang Lou

Prior work in cross-lingual named entity recognition (NER) with no/little labeled data falls into two primary categories: model transfer-based and data transfer-based methods. In this paper, we find that the two method types can complement each other: the former can exploit context information via language-independent features, but sees no task-specific information in the target language; the latter generates pseudo target-language training data via translation, but its exploitation of context information is weakened by inaccurate translations. Moreover, prior work rarely leverages unlabeled data in the target language, which can be effortlessly collected and potentially contains valuable information for improved results. To handle both problems, we propose a novel approach termed UniTrans to Unify both model and data Transfer for cross-lingual NER, and furthermore, leverage the available information from unlabeled target-language data via enhanced knowledge distillation. We evaluate our proposed UniTrans over four target languages on benchmark datasets. Our experimental results show that it substantially outperforms the existing state-of-the-art methods.
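
As a rough sketch of the knowledge-distillation step on unlabeled target-language data (a minimal illustration, not the authors' implementation; the temperature value and model interfaces are assumptions), a student model can be trained to match the teacher's softened label distributions:

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        """KL divergence between softened teacher and student distributions.
        Both tensors have shape (batch, seq_len, num_labels)."""
        t = temperature
        soft_teacher = F.softmax(teacher_logits / t, dim=-1)
        log_student = F.log_softmax(student_logits / t, dim=-1)
        # Scale by t^2 so gradient magnitudes stay comparable across temperatures.
        return F.kl_div(log_student, soft_teacher, reduction="batchmean") * t * t

    # Usage on a batch of unlabeled target-language sentences:
    # with torch.no_grad():
    #     teacher_logits = teacher(input_ids)  # frozen source-trained teacher
    # loss = distillation_loss(student(input_ids), teacher_logits)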

2020 ◽  
Vol 34 (05) ◽  
pp. 9274-9281
Author(s):  
Qianhui Wu ◽  
Zijia Lin ◽  
Guoxin Wang ◽  
Hui Chen ◽  
Börje F. Karlsson ◽  
...  

For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER). While all existing methods directly apply a source-learned model to the target language, in this paper we propose to fine-tune the learned model with a few similar examples given a test case, which can benefit the prediction by leveraging the structural and semantic information conveyed in such similar examples. To this end, we present a meta-learning algorithm to find a good model parameter initialization that can quickly adapt to a given test case, and propose to construct multiple pseudo-NER tasks for meta-training by computing sentence similarities. To further improve the model's generalization ability across different languages, we introduce a masking scheme and augment the loss function with an additional maximum term during meta-training. We conduct extensive experiments on cross-lingual named entity recognition with minimal resources over five target languages. The results show that our approach significantly outperforms existing state-of-the-art methods across the board.
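
A minimal sketch of the test-time adaptation this describes (the retrieval function, model interface, number of inner steps, and learning rate are all assumptions for illustration; the meta-learned initialization, masking scheme, and maximum loss term are not shown):

    import copy
    import torch

    def adapt_and_predict(model, test_batch, retrieve_similar,
                          inner_steps=3, inner_lr=1e-4):
        """Fine-tune a copy of `model` on examples similar to `test_batch`,
        then predict with the adapted copy; the original weights stay intact.
        Assumed interface: model.loss(...) and model.predict(...)."""
        adapted = copy.deepcopy(model)
        optimizer = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
        # Pseudo-task built from the most similar training sentences.
        support = retrieve_similar(test_batch)
        for _ in range(inner_steps):
            optimizer.zero_grad()
            loss = adapted.loss(support["input_ids"], support["labels"])
            loss.backward()
            optimizer.step()
        with torch.no_grad():
            return adapted.predict(test_batch["input_ids"])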


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257230
Author(s):  
Huijiong Yan ◽  
Tao Qian ◽  
Liang Xie ◽  
Shanguang Chen

Named entity recognition (NER) is a fundamental task in the natural language processing (NLP) community. Supervised neural network models based on contextualized word representations can achieve highly competitive performance, but require a large-scale manually-annotated corpus for training. For resource-scarce languages, the construction of such a corpus is expensive and time-consuming, so unsupervised cross-lingual transfer is a good solution to the problem. In this work, we investigate unsupervised cross-lingual NER with model transfer based on contextualized word representations, which greatly advances cross-lingual NER performance. We study several model transfer settings for unsupervised cross-lingual NER, including (1) different types of pretrained transformer-based language models as input, (2) exploration strategies for the multilingual contextualized word representations, and (3) multi-source adaptation. In particular, we propose an adapter-based word representation method combined with a parameter generation network (PGN) to better capture the relationship between the source and target languages. We conduct experiments on the benchmark CoNLL dataset involving four languages to simulate the cross-lingual setting. Results show that highly competitive performance can be obtained by cross-lingual model transfer. In particular, our proposed adapter-based PGN model leads to significant improvements for cross-lingual NER.
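
As a rough analogue of the adapter-based PGN idea (a sketch under assumed dimensions, not the paper's architecture), a parameter generation network can map a learned language embedding to the weights of a residual adapter:

    import torch
    import torch.nn as nn

    class PGNAdapter(nn.Module):
        def __init__(self, hidden=768, bottleneck=64, lang_dim=8, num_langs=4):
            super().__init__()
            self.lang_emb = nn.Embedding(num_langs, lang_dim)
            # The PGN maps a language embedding to flattened adapter weights.
            self.pgn = nn.Linear(lang_dim, hidden * bottleneck * 2)
            self.hidden, self.bottleneck = hidden, bottleneck

        def forward(self, x, lang_id):
            # x: (batch, seq_len, hidden); lang_id: (batch,)
            w = self.pgn(self.lang_emb(lang_id))       # (batch, 2*hidden*bottleneck)
            h, b = self.hidden, self.bottleneck
            down = w[:, :h * b].view(-1, h, b)         # per-language down-projection
            up = w[:, h * b:].view(-1, b, h)           # per-language up-projection
            z = torch.relu(torch.bmm(x, down))         # (batch, seq_len, bottleneck)
            return x + torch.bmm(z, up)                # residual adapter output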


2019 ◽  
Vol 26 (2) ◽  
pp. 163-182 ◽  
Author(s):  
Serge Sharoff

Some languages have very few NLP resources, while many of them are closely related to better-resourced languages. This paper explores how the similarity between the languages can be utilised by porting resources from better- to lesser-resourced languages. The paper introduces a way of building a representation shared across related languages by combining cross-lingual embedding methods with a lexical similarity measure which is based on the weighted Levenshtein distance. One of the outcomes of the experiments is a Panslavonic embedding space for nine Balto-Slavonic languages. The paper demonstrates that the resulting embedding space helps in such applications as morphological prediction, named-entity recognition and genre classification.
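
A minimal sketch of a weighted Levenshtein distance of this kind, where substituting related characters costs less than an arbitrary substitution (the character-pair weight table below is an illustrative assumption, not the paper's):

    def weighted_levenshtein(a, b, sub_cost, indel=1.0):
        """sub_cost(x, y) -> substitution cost in [0, 1] for characters x, y."""
        m, n = len(a), len(b)
        d = [[0.0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            d[i][0] = i * indel
        for j in range(1, n + 1):
            d[0][j] = j * indel
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                d[i][j] = min(d[i - 1][j] + indel,   # deletion
                              d[i][j - 1] + indel,   # insertion
                              d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]))
        return d[m][n]

    # Example: treat 'h'/'g' as near-equivalent, as in a regular sound
    # correspondence between two Slavonic languages.
    cheap = {frozenset("hg"): 0.1}
    cost = lambda x, y: 0.0 if x == y else cheap.get(frozenset((x, y)), 1.0)
    print(weighted_levenshtein("holova", "golova", cost))  # 0.1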


Symmetry ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 1986
Author(s):  
Liguo Yao ◽  
Haisong Huang ◽  
Kuan-Wei Wang ◽  
Shih-Huan Chen ◽  
Qiaoqiao Xiong

Manufacturing text often exists as unlabeled data; its entities are fine-grained and difficult to extract. As a result, the utilization rate of manufacturing-industry knowledge is low. This paper proposes a novel Chinese fine-grained NER (named entity recognition) method based on symmetry lightweight deep multinetwork collaboration (ALBERT-AttBiLSTM-CRF) and model transfer considering active learning (MTAL), to study fine-grained named entity recognition for Chinese textual data with few labels. The method is divided into two stages. In the first stage, ALBERT-AttBiLSTM-CRF is trained and verified on the public CLUENER2020 dataset to obtain a pretrained model; experiments show that the model obtains an F1 score of 0.8962, better than the best baseline algorithm by 9.2%. In the second stage, the pretrained model is transferred to our Manufacturing-NER dataset, and an active learning strategy is used to optimize the model. After model transfer, the final F1 score on Manufacturing-NER reaches 0.8931, up from 0.8576 before the transfer, an improvement of 3.55 percentage points. Our method effectively transfers existing knowledge from public source data to domain-specific target data, addressing named entity recognition with scarce labeled domain data, and proves its effectiveness.
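
As a sketch of the active-learning selection step (least-confidence sampling is one common strategy; the paper's exact criterion and the model interface below are assumptions), the unlabeled sentences the current model is least sure about are sent for annotation:

    import math

    def select_for_annotation(model, unlabeled, budget=100):
        """Pick the sentences the current model is least confident about.
        Assumed interface: model.best_path_probability(sent) returns the
        probability of the Viterbi tag sequence under the CRF layer."""
        scored = []
        for sent in unlabeled:
            p_best = model.best_path_probability(sent)
            # Length-normalize so long sentences are not always selected.
            confidence = math.exp(math.log(p_best) / max(len(sent), 1))
            scored.append((confidence, sent))
        scored.sort(key=lambda pair: pair[0])  # least confident first
        return [sent for _, sent in scored[:budget]]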


2020 ◽  
Vol 34 (05) ◽  
pp. 8480-8487
Author(s):  
Stephen Mayhew ◽  
Gupta Nitish ◽  
Dan Roth

Although modern named entity recognition (NER) systems show impressive performance on standard datasets, they perform poorly when presented with noisy data. In particular, capitalization is a strong signal for entities in many languages, and even state-of-the-art models overfit to this feature, with drastically lower performance on uncapitalized text. In this work, we address the robustness of NER systems on data with noisy or uncertain casing, using a pretraining objective that predicts casing in text (a truecaser), leveraging unlabeled data. The pretrained truecaser is combined with a standard BiLSTM-CRF model for NER by appending its output distributions to the character embeddings. In experiments over several datasets of varying domain and casing quality, we show that our new model improves performance on uncased text, even adding value to uncased BERT embeddings. Our method achieves a new state of the art on the WNUT17 shared task dataset.
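
A minimal sketch of how a frozen truecaser's output distributions can be appended to character embeddings before the character-level encoder (the dimensions and the two-class casing output are assumptions, not the paper's exact configuration):

    import torch
    import torch.nn as nn

    class CasingAwareCharEncoder(nn.Module):
        def __init__(self, truecaser, n_chars=128, char_dim=30, hidden=50):
            super().__init__()
            self.truecaser = truecaser  # pretrained, emits per-character casing logits
            self.char_emb = nn.Embedding(n_chars, char_dim)
            # +2 for the truecaser's (lower, upper) probability per character.
            self.encoder = nn.LSTM(char_dim + 2, hidden,
                                   bidirectional=True, batch_first=True)

        def forward(self, char_ids):
            emb = self.char_emb(char_ids)                # (batch, len, char_dim)
            with torch.no_grad():                        # truecaser stays frozen
                casing = torch.softmax(self.truecaser(char_ids), dim=-1)
            out, _ = self.encoder(torch.cat([emb, casing], dim=-1))
            return out  # fed onward to the word-level BiLSTM-CRF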


2014 ◽  
Vol 37 (1) ◽  
pp. 1-22 ◽  
Author(s):  
Asif Ekbal ◽  
Sivaji Bandyopadhyay

This paper reports a voted Named Entity Recognition (NER) system that exploits appropriate unlabeled data. Initially, we develop NER systems using supervised machine learning algorithms such as Maximum Entropy (ME), Conditional Random Field (CRF) and Support Vector Machine (SVM). Each of these models makes use of language-independent features in the form of different contextual and orthographic word-level features, along with language-dependent features extracted from a Part-of-Speech (POS) tagger and gazetteers. Context patterns generated from the unlabeled data using an active learning method are also used as features in each of the classifiers. A semi-supervised method is proposed that defines measures to automatically select effective documents and sentences from the unlabeled data. Finally, the supervised models are combined into a final system by defining an appropriate weighted voting technique. Experimental results for a resource-poor language like Bengali show the effectiveness of the proposed approach, with overall recall, precision and F-measure values of 93.81%, 92.18% and 92.98%, respectively.
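
A minimal sketch of per-token weighted voting across the three taggers (the weights would typically come from each model's development-set performance; the numbers below are placeholders):

    from collections import defaultdict

    def vote(tag_sequences, weights):
        """tag_sequences: {'ME': [...], 'CRF': [...], 'SVM': [...]},
        each a list of tags for the same sentence."""
        length = len(next(iter(tag_sequences.values())))
        voted = []
        for i in range(length):
            scores = defaultdict(float)
            for name, tags in tag_sequences.items():
                scores[tags[i]] += weights[name]
            voted.append(max(scores, key=scores.get))
        return voted

    preds = {"ME":  ["B-PER", "O", "O"],
             "CRF": ["B-PER", "I-PER", "O"],
             "SVM": ["B-PER", "I-PER", "O"]}
    print(vote(preds, {"ME": 0.92, "CRF": 0.94, "SVM": 0.93}))
    # ['B-PER', 'I-PER', 'O']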


2018 ◽  
Author(s):  
Jiateng Xie ◽  
Zhilin Yang ◽  
Graham Neubig ◽  
Noah A. Smith ◽  
Jaime Carbonell
