Constrained Labeled Data Generation for Low-Resource Named Entity Recognition

Training Data ◽

Entity Recognition ◽

Linguistic Knowledge ◽

Rule Based ◽

Low Resource ◽

Named Entity ◽

The North ◽

Rule Based Approach

This paper studies named entity recognition (NER) for Kokborok using a rule-based approach. NER is an application of natural language processing and is considered a subtask of information extraction: it identifies the named entities relevant to a specific task, such as the names of persons, organizations, and locations. Kokborok is an official language of the state of Tripura, situated in the north-eastern part of India. It is also widely spoken in other parts of north-eastern India and in adjoining areas of Bangladesh. NER has been studied with machine learning approaches, rule-based approaches, and hybrid approaches that combine the two. Rule-based NER is driven by linguistic knowledge of the language, whereas machine learning approaches require a large amount of training data. Kokborok, being a low-resource language, has very limited training data. The rule-based approach requires linguistic rules, and its results do not depend on the amount of data available. We frame heuristic rules for identifying named entities based on linguistic knowledge of the language, and we obtain encouraging results when testing our data with the rule-based approach. We also study and frame rules for the counting system in Kokborok. We find the rule-based approach to NER suitable for low-resource languages with limited digital work and no named-entity-tagged data. Using these rules, we frame an algorithm for the named entity recognition task that gives the desired results.
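To make the idea of heuristic, rule-based tagging concrete, here is a minimal sketch. It is not the authors' algorithm: the cue lists, the gazetteer, the capitalization check, and the function name `tag_tokens` are all hypothetical placeholders chosen for illustration, and they do not represent actual Kokborok linguistic rules.

```python
from typing import List, Tuple

# Hypothetical resources for illustration only; a real system would
# derive these from linguistic knowledge of the target language.
PERSON_TITLES = {"mr", "mrs", "dr"}
LOCATION_GAZETTEER = {"tripura", "agartala", "bangladesh"}
ORG_CUES = {"university", "college", "council"}


def tag_tokens(tokens: List[str]) -> List[Tuple[str, str]]:
    """Assign PER/LOC/ORG/O tags using three simple hand-written rules."""
    tags = ["O"] * len(tokens)
    for i, tok in enumerate(tokens):
        low = tok.lower().rstrip(".")
        if low in LOCATION_GAZETTEER:
            # Rule 1: gazetteer lookup for known place names.
            tags[i] = "LOC"
        elif low in ORG_CUES and i > 0 and tokens[i - 1][:1].isupper():
            # Rule 2: an organization cue word preceded by a capitalized
            # token marks both tokens as ORG (overrides an earlier LOC).
            tags[i - 1] = "ORG"
            tags[i] = "ORG"
        elif (low in PERSON_TITLES and i + 1 < len(tokens)
              and tokens[i + 1][:1].isupper()):
            # Rule 3: a title followed by a capitalized token marks
            # the following token as a person name.
            tags[i + 1] = "PER"
    return list(zip(tokens, tags))


if __name__ == "__main__":
    sentence = "Dr. Debbarma visited Tripura University in Agartala".split()
    for token, tag in tag_tokens(sentence):
        print(token, tag)
```

Because the rules are written by hand, the tagger needs no annotated training data, which is the property the abstract highlights for low-resource settings. The cost is that coverage depends entirely on the quality of the rules and word lists.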

Unsupervised Paraphrasing Consistency Training for Low Resource Named Entity Recognition

10.18653/v1/2021.emnlp-main.430 ◽

2021 ◽

Author(s):

Rui Wang ◽

Ricardo Henao

Keyword(s):

Entity Recognition ◽

Low Resource ◽

Embedding Transfer for Low-Resource Medical Named Entity Recognition: A Case Study on Patient Mobility

10.18653/v1/w18-2301 ◽

2018 ◽

Cited By ~ 3

Author(s):

Denis Newman-Griffis ◽

Ayah Zirikly

Keyword(s):

Entity Recognition ◽

Patient Mobility ◽

Low Resource ◽

Soft Gazetteers for Low-Resource Named Entity Recognition

10.18653/v1/2020.acl-main.722 ◽

2020 ◽

Author(s):

Shruti Rijhwani ◽

Shuyan Zhou ◽

Graham Neubig ◽

Jaime Carbonell

Keyword(s):

Entity Recognition ◽

Low Resource ◽

Low Resource Named Entity Recognition Using Contextual Word Representation and Neural Cross-Lingual Knowledge Transfer

Neural Information Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-030-36708-4_25 ◽

2019 ◽

pp. 299-311

Author(s):

Soyeon Caren Han ◽

Yingru Lin ◽

Siqu Long ◽

Josiah Poon

Keyword(s):

Knowledge Transfer ◽

Entity Recognition ◽

Low Resource ◽

Named Entity ◽

Word Representation ◽

Cross Lingual

Combining rule-based and statistical mechanisms for low-resource named entity recognition

Machine Translation ◽

10.1007/s10590-017-9208-0 ◽

2017 ◽

Vol 32 (1-2) ◽

pp. 31-43 ◽

Cited By ~ 2

Author(s):

Ryan Gabbard ◽

Jay DeYoung ◽

Constantine Lignos ◽

Marjorie Freedman ◽

Ralph Weischedel

Keyword(s):

Entity Recognition ◽

Rule Based ◽

Low Resource ◽

PDALN: Progressive Domain Adaptation over a Pre-trained Model for Low-Resource Cross-Domain Named Entity Recognition

10.18653/v1/2021.emnlp-main.442 ◽

2021 ◽

Author(s):

Tao Zhang ◽

Congying Xia ◽

Philip S. Yu ◽

Zhiwei Liu ◽

Shu Zhao

Keyword(s):

Domain Adaptation ◽

Entity Recognition ◽

Low Resource ◽

Named Entity ◽

Cross Domain

Named Entity Recognition with Word Embeddings and Wikipedia Categories for a Low-Resource Language

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3015467 ◽

2017 ◽

Vol 16 (3) ◽

pp. 1-19 ◽

Cited By ~ 12

Author(s):

Arjun Das ◽

Debasis Ganguly ◽

Utpal Garain

Keyword(s):

Entity Recognition ◽

Word Embeddings ◽

Low Resource ◽

Improving Low Resource Named Entity Recognition using Cross-lingual Knowledge Transfer

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/566 ◽

2018 ◽

Cited By ~ 3

Author(s):

Xiaocheng Feng ◽

Xiachong Feng ◽

Bing Qin ◽

Zhangyin Feng ◽

Ting Liu

Keyword(s):

Neural Networks ◽

Knowledge Transfer ◽

State Of The Art ◽

Entity Recognition ◽

Semantic Representations ◽

Low Resource ◽

Named Entity ◽

High Resource ◽

Distribution Features

Neural networks have been widely used for named entity recognition (NER) in high-resource languages (e.g., English) and have shown state-of-the-art results. However, for low-resource languages such as Dutch and Spanish, taggers tend to perform worse because of limited resources and a lack of annotated data. To narrow this gap, we propose three novel strategies to enrich the semantic representations of low-resource languages. First, we develop neural networks that improve low-resource word representations through knowledge transfer from a high-resource language using bilingual lexicons. Second, we design a lexicon extension strategy that addresses the out-of-lexicon problem by automatically learning semantic projections. Third, we treat word-level entity type distribution features as external, language-independent knowledge and incorporate them into our neural architecture. Experiments on two low-resource languages (Dutch and Spanish) demonstrate the effectiveness of these additional semantic representations (average 4.8% improvement). Moreover, on the Chinese OntoNotes 4.0 dataset, our approach achieves an F-score of 83.07%, a 2.91% absolute gain over the state-of-the-art results.
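As a rough illustration of what a word-level entity type distribution feature could look like, the sketch below counts how often each word appears with each entity type in an annotated corpus and normalizes the counts. This is one plausible reading of the abstract, not the authors' implementation: the tag set, the lowercasing, and the function name `type_distribution` are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Assumed tag inventory; the paper's actual label set may differ.
ENTITY_TYPES = ["PER", "LOC", "ORG", "O"]


def type_distribution(
    corpus: List[List[Tuple[str, str]]]
) -> Dict[str, List[float]]:
    """Map each word to its relative frequency under each entity type."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1

    features: Dict[str, List[float]] = {}
    for word, tag_counts in counts.items():
        total = sum(tag_counts.values())
        # A Counter returns 0 for tags the word never appeared with.
        features[word] = [tag_counts[t] / total for t in ENTITY_TYPES]
    return features


if __name__ == "__main__":
    toy_corpus = [
        [("Amsterdam", "LOC"), ("is", "O")],
        [("Amsterdam", "ORG"), ("is", "O")],
    ]
    print(type_distribution(toy_corpus)["amsterdam"])
```

Each resulting vector could then be concatenated with a word embedding before it is passed to a sequence tagger; how the paper actually integrates the feature into its architecture is not specified in the abstract.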

Improving Low-Resource Named Entity Recognition using Joint Sentence and Token Labeling

10.18653/v1/2020.acl-main.523 ◽

2020 ◽

Author(s):

Canasai Kruengkrai ◽

Thien Hai Nguyen ◽

Sharifah Mahani Aljunied ◽

Lidong Bing

Keyword(s):

Entity Recognition ◽

Low Resource ◽