Enhanced Meta-Learning for Cross-Lingual Named Entity Recognition with Minimal Resources

Qianhui Wu; Zijia Lin; Guoxin Wang; Hui Chen; Börje F. Karlsson; Biqing Huang; Chin-Yew Lin

doi:10.1609/aaai.v34i05.6466

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6466 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9274-9281

Author(s):

Qianhui Wu ◽

Zijia Lin ◽

Guoxin Wang ◽

Hui Chen ◽

Börje F. Karlsson ◽

...

Keyword(s):

Learning Algorithm ◽

Named Entity Recognition ◽

Entity Recognition ◽

Target Language ◽

Test Case ◽

Named Entity ◽

Meta Learning ◽

Target Languages ◽

Cross Lingual

For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER). While all existing methods directly transfer a source-learned model to a target language, in this paper, we propose to fine-tune the learned model with a few similar examples given a test case, which could benefit the prediction by leveraging the structural and semantic information conveyed in such similar examples. To this end, we present a meta-learning algorithm to find a good model parameter initialization that can quickly adapt to the given test case, and we propose to construct multiple pseudo-NER tasks for meta-training by computing sentence similarities. To further improve the model's generalization ability across different languages, we introduce a masking scheme and augment the loss function with an additional maximum term during meta-training. We conduct extensive experiments on cross-lingual named entity recognition with minimal resources over five target languages. The results show that our approach significantly outperforms existing state-of-the-art methods across the board.
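
The idea of building pseudo-NER tasks from sentence similarities can be illustrated with a minimal sketch. The abstract does not say how sentences are embedded or compared, so the toy 3-dimensional vectors, the cosine metric, the value of `k`, and the names `top_k_similar` and `build_pseudo_tasks` are all assumptions made for illustration, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def top_k_similar(query_vec, pool, k):
    """Return the ids of the k sentences in `pool` most similar to `query_vec`."""
    ranked = sorted(pool.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [sent_id for sent_id, _ in ranked[:k]]

def build_pseudo_tasks(pool, k):
    """Treat every sentence as a query; its k nearest neighbours
    (excluding itself) form the support set of one pseudo task."""
    tasks = []
    for sent_id, vec in pool.items():
        others = {o: v for o, v in pool.items() if o != sent_id}
        tasks.append({"query": sent_id, "support": top_k_similar(vec, others, k)})
    return tasks

# Hypothetical sentence embeddings (assumption: any sentence encoder could supply these).
POOL = {
    "s1": [1.0, 0.0, 0.0],
    "s2": [0.9, 0.1, 0.0],
    "s3": [0.0, 1.0, 0.0],
    "s4": [0.0, 0.9, 0.1],
}

if __name__ == "__main__":
    for task in build_pseudo_tasks(POOL, k=1):
        print(task)
```

In a meta-learning setup of the kind the abstract describes, each such task would supply the few similar examples used for the fast adaptation step, with the query sentence used to evaluate the adapted model.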

UniTrans : Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/543 ◽

2020 ◽

Author(s):

Qianhui Wu ◽

Zijia Lin ◽

Börje F. Karlsson ◽

Biqing Huang ◽

Jian-Guang Lou

Keyword(s):

Data Transfer ◽

Named Entity Recognition ◽

Unlabeled Data ◽

Entity Recognition ◽

Target Language ◽

Context Information ◽

Prior Work ◽

Named Entity ◽

Model Transfer ◽

Cross Lingual

Prior work in cross-lingual named entity recognition (NER) with no/little labeled data falls into two primary categories: model transfer- and data transfer-based methods. In this paper, we find that both method types can complement each other, in the sense that, the former can exploit context information via language-independent features but sees no task-specific information in the target language; while the latter generally generates pseudo target-language training data via translation but its exploitation of context information is weakened by inaccurate translations. Moreover, prior work rarely leverages unlabeled data in the target language, which can be effortlessly collected and potentially contains valuable information for improved results. To handle both problems, we propose a novel approach termed UniTrans to Unify both model and data Transfer for cross-lingual NER, and furthermore, leverage the available information from unlabeled target-language data via enhanced knowledge distillation. We evaluate our proposed UniTrans over 4 target languages on benchmark datasets. Our experimental results show that it substantially outperforms the existing state-of-the-art methods.
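
As a rough illustration of how unlabeled target-language text can be used through distillation, the sketch below computes a plain soft-label distillation loss: a student model is scored against a teacher's per-token label distributions. This is ordinary knowledge distillation, not the "enhanced" variant the abstract mentions, whose details are not given here; the function names and the three-label toy distributions are assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cross_entropy(teacher_probs, student_logits):
    """Cross-entropy of the student's distribution against the teacher's soft labels."""
    student_probs = softmax(student_logits)
    return -sum(t * math.log(s)
                for t, s in zip(teacher_probs, student_probs) if t > 0.0)

def distillation_loss(teacher_probs_seq, student_logits_seq):
    """Average token-level soft cross-entropy over one unlabeled sentence."""
    per_token = [soft_cross_entropy(t, s)
                 for t, s in zip(teacher_probs_seq, student_logits_seq)]
    return sum(per_token) / len(per_token)

if __name__ == "__main__":
    # Two tokens, three hypothetical labels (e.g. O, B-PER, B-LOC).
    teacher = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
    student = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]]
    print(distillation_loss(teacher, student))
```

No gold labels appear anywhere in the loss, which is why unlabeled target-language sentences are sufficient for this step.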

Test case Recommendation for regression with Named Entity Recognition for test step prediction

2021 4th Biennial International Conference on Nascent Technologies in Engineering (ICNTE) ◽

10.1109/icnte51185.2021.9487774 ◽

2021 ◽

Author(s):

Sebin Benny John ◽

Divyansh Gaur ◽

Amroz Siddiqui

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Test Case ◽

Named Entity

Finding next of kin: Cross-lingual embedding spaces for related languages

Natural Language Engineering ◽

10.1017/s1351324919000354 ◽

2019 ◽

Vol 26 (2) ◽

pp. 163-182 ◽

Cited By ~ 1

Author(s):

Serge Sharoff

Keyword(s):

Similarity Measure ◽

Named Entity Recognition ◽

Entity Recognition ◽

Levenshtein Distance ◽

Next Of Kin ◽

Named Entity ◽

Genre Classification ◽

Lexical Similarity ◽

Cross Lingual ◽

Embedding Methods

Some languages have very few NLP resources, while many of them are closely related to better-resourced languages. This paper explores how the similarity between the languages can be utilised by porting resources from better- to lesser-resourced languages. The paper introduces a way of building a representation shared across related languages by combining cross-lingual embedding methods with a lexical similarity measure which is based on the weighted Levenshtein distance. One of the outcomes of the experiments is a Panslavonic embedding space for nine Balto-Slavonic languages. The paper demonstrates that the resulting embedding space helps in such applications as morphological prediction, named-entity recognition and genre classification.
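
A weighted Levenshtein distance differs from the standard one only in that substitutions (and optionally insertions and deletions) carry character-dependent costs. The sketch below shows the usual dynamic-programming formulation with a pluggable substitution cost. The weights in `toy_sub_cost` (vowel-for-vowel substitutions cost 0.5) are invented for illustration; the abstract does not state the weights used in the paper.

```python
def weighted_levenshtein(a, b, sub_cost, ins_cost=1.0, del_cost=1.0):
    """Edit distance between strings a and b with a custom substitution cost."""
    # prev[j] holds the cost of turning the processed prefix of a into b[:j].
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * del_cost]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + del_cost,              # delete ca
                curr[j - 1] + ins_cost,          # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # substitute ca -> cb (0 if equal)
            ))
        prev = curr
    return prev[-1]

VOWELS = set("aeiou")

def toy_sub_cost(x, y):
    """Hypothetical weights: identical characters are free, vowel-for-vowel is cheap."""
    if x == y:
        return 0.0
    if x in VOWELS and y in VOWELS:
        return 0.5
    return 1.0

def unit_sub_cost(x, y):
    """Standard (unweighted) Levenshtein substitution cost."""
    return 0.0 if x == y else 1.0

if __name__ == "__main__":
    # Transliterated cognates for "milk"; the pair is only an illustration.
    print(weighted_levenshtein("moloko", "mleko", unit_sub_cost))
    print(weighted_levenshtein("moloko", "mleko", toy_sub_cost))
```

With weights that reflect regular sound correspondences, cognates in related languages end up closer to each other than under unit costs, which is what makes the measure usable as a lexical similarity signal.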

Evaluation of Named Entity Recognition Algorithms in Short Texts

CLEI electronic journal ◽

10.19153/cleiej.20.1.4 ◽

2017 ◽

Cited By ~ 1

Author(s):

Edgar Casasola Murillo ◽

Raquel Fonseca

Keyword(s):

Social Networks ◽

Sentiment Analysis ◽

Named Entity Recognition ◽

Entity Recognition ◽

Redes Sociales ◽

Named Entity ◽

Processing Strategies ◽

New Type ◽

Manual Processing

One of the major consequences of the growth of social networks has been the generation of huge volumes of content. The text generated in social networks constitutes a new type of content that is short, informal, sometimes ungrammatical, and noise-prone. Given the volume of information produced every day, manual processing of this data is impractical, creating the need to explore and apply automatic processing strategies such as Entity Recognition (ER). Given the characteristics of this content, it also becomes necessary to evaluate the performance of traditional ER algorithms on corpora of this kind. This paper presents the results of applying the AlchemyAPI and Dandelion API algorithms to a corpus provided by the SemEval-2015 Aspect Based Sentiment Analysis conference. The entities recognized by each algorithm were compared against those annotated in the collection in order to calculate their precision and recall. Dandelion API obtained better results than AlchemyAPI on the given corpus.
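
The comparison described above reduces to set arithmetic over recognized and annotated entities. A minimal sketch follows, assuming exact matching on (text, type) pairs; the abstract does not specify the matching criterion, and the entities below are invented examples, not data from the study.

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted entities against gold annotations.

    Both arguments are iterables of hashable entity records; an entity
    counts as correct only if it matches a gold record exactly.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

if __name__ == "__main__":
    # Hypothetical annotations and system output.
    gold = {("Acme Bistro", "ORG"), ("Lisbon", "LOC")}
    predicted = {("Acme Bistro", "ORG"), ("Lisbon", "LOC"),
                 ("Friday", "ORG"), ("Menu", "LOC")}
    print(precision_recall(predicted, gold))
```

Here the system finds both annotated entities but also reports two spurious ones, so recall is perfect while precision is halved.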

Biomedical Named Entity Recognition through a Multi-Agent Meta-Learning Framework

Chinese Journal of Computers ◽

10.3724/sp.j.1016..2010.01256 ◽

2010 ◽

Vol 33 (7) ◽

pp. 1256-1262 ◽

Cited By ~ 1

Author(s):

Hao-Chang WANG ◽

Yu LI ◽

Tie-Jun ZHAO

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity ◽

Learning Framework ◽

Meta Learning ◽

Multi Agent ◽

Biomedical Named Entity Recognition

Low Resource Named Entity Recognition Using Contextual Word Representation and Neural Cross-Lingual Knowledge Transfer

Neural Information Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-030-36708-4_25 ◽

2019 ◽

pp. 299-311

Author(s):

Soyeon Caren Han ◽

Yingru Lin ◽

Siqu Long ◽

Josiah Poon

Keyword(s):

Knowledge Transfer ◽

Named Entity Recognition ◽

Entity Recognition ◽

Low Resource ◽

Named Entity ◽

Word Representation ◽

Cross Lingual

Uncertainty query sampling strategies for active learning of named entity recognition task

Intelligent Decision Technologies ◽

10.3233/idt-200048 ◽

2021 ◽

Vol 15 (1) ◽

pp. 99-114

Author(s):

Ankit Agrawal ◽

Sarsij Tripathi ◽

Manu Vardhan

Keyword(s):

Active Learning ◽

Learning Algorithm ◽

Named Entity Recognition ◽

Recognition Task ◽

Sampling Strategy ◽

Entity Recognition ◽

Learning Approaches ◽

Sampling Strategies ◽

Named Entity ◽

Final Probability

Active learning is a well-known method for labeling huge unannotated datasets with minimal effort and in a cost-efficient way. The approach iteratively selects the most informative instances and adds them to the training set, so that the learner's performance improves with each iteration. Named entity recognition (NER) is a key task in information extraction, in which the entities present in a sequence are labeled with the correct class. Traditional query sampling strategies for active learning consider only the final probability value of the model when selecting the most informative instances. In this paper, we propose a new active learning algorithm based on a hybrid query sampling strategy that also considers sentence similarity along with the final probability value of the model, and compare it with four other well-known pool-based uncertainty query sampling strategies for NER: least confident sampling, margin of confidence sampling, ratio of confidence sampling, and entropy query sampling. The experiments were performed over three biomedical NER datasets from different domains and a Spanish-language NER dataset. We found that all of the above approaches are able to reach the performance of a supervised learning approach while requiring much less annotated training data. The proposed active learning algorithm performs well and, in most cases, further reduces the annotation cost compared with the other sampling-strategy-based active learning algorithms.
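
The four baseline strategies named in the abstract are commonly defined over a model's predicted class probabilities, as sketched below. The formulas are the usual textbook definitions, oriented so that a higher score means more uncertainty; the paper may aggregate token-level scores to sentences differently, and its hybrid strategy with sentence similarity is not reproduced here. The pool contents and the helper `select_most_uncertain` are assumptions.

```python
import math

def least_confidence(probs):
    """1 minus the probability of the most likely class."""
    return 1.0 - max(probs)

def margin_of_confidence(probs):
    """1 minus the gap between the two most likely classes."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return 1.0 - (top1 - top2)

def ratio_of_confidence(probs):
    """Second-highest probability divided by the highest."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top2 / top1

def entropy(probs):
    """Shannon entropy (natural log) of the predicted distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_uncertain(pool, score_fn, n):
    """Return ids of the n pool instances with the highest uncertainty score."""
    ranked = sorted(pool, key=lambda inst_id: score_fn(pool[inst_id]), reverse=True)
    return ranked[:n]

# Hypothetical model outputs for three unlabeled instances.
POOL = {
    "a": [0.90, 0.05, 0.05],
    "b": [0.40, 0.35, 0.25],
    "c": [0.60, 0.30, 0.10],
}

if __name__ == "__main__":
    for fn in (least_confidence, margin_of_confidence, ratio_of_confidence, entropy):
        print(fn.__name__, select_most_uncertain(POOL, fn, n=2))
```

On this toy pool all four scores agree on the ranking; on real data they can disagree, which is one reason such strategies are compared empirically.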

English-Korean Cross-lingual Link Discovery Using Link Probability and Named Entity Recognition

Journal of Korean institute of intelligent systems ◽

10.5391/jkiis.2013.23.3.191 ◽

2013 ◽

Vol 23 (3) ◽

pp. 191-195 ◽

Cited By ~ 3

Author(s):

Shin-Jae Kang

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity ◽

Link Discovery ◽

Cross Lingual

Neural Cross-Lingual Named Entity Recognition with Minimal Resources

10.18653/v1/d18-1034 ◽

2018 ◽

Cited By ~ 6

Author(s):

Jiateng Xie ◽

Zhilin Yang ◽

Graham Neubig ◽

Noah A. Smith ◽

Jaime Carbonell

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity ◽

Cross Lingual

Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection

10.18653/v1/p17-1135 ◽

2017 ◽

Cited By ~ 7

Author(s):

Jian Ni ◽

Georgiana Dinu ◽

Radu Florian

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity ◽

Weakly Supervised ◽

Cross Lingual
