Using Chinese Glyphs for Named Entity Recognition (Student Abstract)

2020 ◽  
Vol 34 (10) ◽  
pp. 13921-13922
Author(s):  
Chan Hee Song ◽  
Arijit Sehanobish

Most Named Entity Recognition (NER) systems use additional features like part-of-speech (POS) tags, shallow parsing, gazetteers, etc. Adding these external features to NER systems has been shown to have a positive impact. However, creating gazetteers or taggers can take a lot of time and may require extensive data cleaning. In this work, instead of using these traditional features, we use lexicographic features of Chinese characters. Chinese characters are composed of graphical components called radicals, and these components often act as semantic indicators. We propose CNN-based models that incorporate this semantic information and use them for NER. Our models show an improvement over the baseline BERT-BiLSTM-CRF model. We present one of the first studies on Chinese OntoNotes v5.0 and show an improvement of +0.64 F1 over the baseline. We present a state-of-the-art (SOTA) F1 score of 71.81 on the Weibo dataset, show a competitive improvement of +0.72 over the baseline on the ResumeNER dataset, and a SOTA F1 score of 96.49 on the MSRA dataset.
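As a hedged illustration of the radical idea, the sketch below runs a small CNN over per-character radical components and max-pools the result into a feature vector that could be concatenated with contextual character embeddings. The toy RADICALS table, layer sizes, and pooling choice are assumptions for illustration, not the paper's exact architecture.

```python
# Minimal sketch of a radical-aware character encoder for Chinese NER.
# The RADICALS decomposition table is a toy stand-in: real systems derive
# it from resources such as the Kangxi radical inventory or IDS data.
import torch
import torch.nn as nn

RADICALS = {"江": ["氵", "工"], "河": ["氵", "可"], "你": ["亻", "尔"]}  # toy table
rad_vocab = {r: i + 1 for i, r in enumerate({r for rs in RADICALS.values() for r in rs})}

class RadicalCNN(nn.Module):
    def __init__(self, n_radicals, rad_dim=16, out_dim=32):
        super().__init__()
        self.embed = nn.Embedding(n_radicals + 1, rad_dim, padding_idx=0)
        self.conv = nn.Conv1d(rad_dim, out_dim, kernel_size=2, padding=1)

    def forward(self, part_ids):                  # (batch, max_parts)
        x = self.embed(part_ids).transpose(1, 2)  # (batch, rad_dim, max_parts)
        x = torch.relu(self.conv(x))              # (batch, out_dim, L)
        return torch.max(x, dim=2).values         # max-pool over components

def radical_ids(char, max_parts=4):
    ids = [rad_vocab.get(r, 0) for r in RADICALS.get(char, [])][:max_parts]
    return ids + [0] * (max_parts - len(ids))

enc = RadicalCNN(n_radicals=len(rad_vocab))
batch = torch.tensor([radical_ids(c) for c in "江河你"])
print(enc(batch).shape)  # torch.Size([3, 32]) -> concatenate with char vectors
```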

Author(s):  
Victor Sanh ◽  
Thomas Wolf ◽  
Sebastian Ruder

Much effort has been devoted to evaluating whether multi-task learning can be leveraged to learn rich representations that can be used in various Natural Language Processing (NLP) downstream applications. However, there is still a lack of understanding of the settings in which multi-task learning has a significant effect. In this work, we introduce a hierarchical model trained in a multi-task learning setup on a set of carefully selected semantic tasks. The model is trained in a hierarchical fashion to introduce an inductive bias, by supervising a set of low-level tasks at the bottom layers of the model and more complex tasks at its top layers. This model achieves state-of-the-art results on a number of tasks, namely Named Entity Recognition, Entity Mention Detection and Relation Extraction, without hand-engineered features or external NLP tools like syntactic parsers. The hierarchical training supervision induces a set of shared semantic representations at the lower layers of the model. We show that, as we move from the bottom to the top layers of the model, the hidden states of the layers tend to represent more complex semantic information.
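To make the hierarchical-supervision idea concrete, here is a minimal sketch in which a low-level task (NER) is supervised on an early recurrent layer and a more complex sentence-level task on a later one. The dimensions and choice of task heads are illustrative assumptions, not the authors' exact model.

```python
# Hedged sketch of hierarchical multi-task supervision: the NER head reads
# an early encoder layer, a relation-style head reads a later one, so
# gradients from both tasks shape the shared lower representations.
import torch
import torch.nn as nn

class HierarchicalMTL(nn.Module):
    def __init__(self, vocab=1000, emb=64, hid=64, n_ner=9, n_rel=5):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lower = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.upper = nn.LSTM(2 * hid, hid, batch_first=True, bidirectional=True)
        self.ner_head = nn.Linear(2 * hid, n_ner)  # supervised at the bottom
        self.rel_head = nn.Linear(2 * hid, n_rel)  # supervised at the top

    def forward(self, tokens):
        low, _ = self.lower(self.embed(tokens))
        high, _ = self.upper(low)
        return self.ner_head(low), self.rel_head(high.mean(dim=1))

model = HierarchicalMTL()
tokens = torch.randint(0, 1000, (2, 12))           # (batch, seq_len)
ner_logits, rel_logits = model(tokens)
loss = nn.functional.cross_entropy(ner_logits.reshape(-1, 9),
                                   torch.randint(0, 9, (2 * 12,))) \
     + nn.functional.cross_entropy(rel_logits, torch.randint(0, 5, (2,)))
loss.backward()  # both task losses update the shared lower layers
```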


Author(s):  
M. Bevza

We analyze neural network architectures that yield state-of-the-art results on the named entity recognition (NER) task and propose a number of new architectures for improving results even further. We have analyzed a range of ideas and approaches that researchers have used to achieve state-of-the-art results in a variety of NLP tasks. In this work, we present the architectures that we consider most likely to improve the existing state-of-the-art solutions for the NER and part-of-speech (POS) tagging tasks. The architectures are inspired by recent developments in multi-task learning. This work tests the hypothesis that NER and POS tagging are related tasks, so that adding POS tags as input to the network can help achieve better NER results and, vice versa, information about NER tags can help with POS tagging. This work also contains the implementation of the network and the results of the experiments, together with conclusions and future work.
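A minimal sketch of the hypothesized NER/POS coupling: POS tag embeddings are concatenated to the word embeddings as input, while the shared encoder is also trained to predict POS tags. The layer and tag-set sizes are placeholder assumptions.

```python
# Illustrative two-headed model for the NER <-> POS coupling the abstract
# hypothesizes; one loss term per head during training.
import torch
import torch.nn as nn

class JointNerPos(nn.Module):
    def __init__(self, vocab=1000, n_pos=17, n_ner=9, emb=64, pos_emb=16, hid=64):
        super().__init__()
        self.word_embed = nn.Embedding(vocab, emb)
        self.pos_embed = nn.Embedding(n_pos, pos_emb)
        self.encoder = nn.LSTM(emb + pos_emb, hid, batch_first=True,
                               bidirectional=True)
        self.ner_head = nn.Linear(2 * hid, n_ner)
        self.pos_head = nn.Linear(2 * hid, n_pos)

    def forward(self, words, pos_tags):
        x = torch.cat([self.word_embed(words), self.pos_embed(pos_tags)], dim=-1)
        h, _ = self.encoder(x)
        return self.ner_head(h), self.pos_head(h)

words = torch.randint(0, 1000, (2, 10))
pos = torch.randint(0, 17, (2, 10))   # tags from an external tagger
ner_logits, pos_logits = JointNerPos()(words, pos)
```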


Author(s):  
M. Bevza

We analyze neural network architectures that yield state-of-the-art results on the named entity recognition (NER) task and propose a new architecture for improving results even further. We have analyzed a range of ideas and approaches that researchers have used to achieve state-of-the-art results in a variety of NLP tasks. In this work, we present the ones that we consider most likely to improve the existing state-of-the-art solutions for NER. The architecture is inspired by recent developments in language modeling, and the suggested solution is based on a multi-task learning approach. We incorporate part-of-speech (POS) tags, produced by a state-of-the-art external tagger, as input to the network, and we also ask the network to predict those tags in addition to the main NER tags. In this way, knowledge is distilled from a strong POS tagger into our smaller network. We hypothesize that designing the neural network architecture in this way improves the generalizability of the system, and we provide arguments to support this statement.
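The distillation step could look roughly like the following sketch: the student optimizes its hard NER loss plus a soft KL term that pulls its POS predictions toward a frozen teacher tagger's distribution. The temperature, weighting, and tensor shapes are illustrative assumptions.

```python
# Sketch of distillation from a strong POS tagger into a smaller network:
# hard NER loss plus a temperature-softened KL term against the teacher.
import torch
import torch.nn.functional as F

def distill_loss(ner_logits, ner_gold, pos_logits, teacher_pos_logits,
                 T=2.0, alpha=0.5):
    """Hard NER loss + soft POS loss against the frozen teacher tagger."""
    hard = F.cross_entropy(ner_logits, ner_gold)
    soft = F.kl_div(F.log_softmax(pos_logits / T, dim=-1),
                    F.softmax(teacher_pos_logits / T, dim=-1),
                    reduction="batchmean") * T * T
    return alpha * hard + (1 - alpha) * soft

ner_logits = torch.randn(8, 9, requires_grad=True)    # student NER scores
pos_logits = torch.randn(8, 17, requires_grad=True)   # student POS scores
teacher = torch.randn(8, 17)                          # frozen strong tagger
loss = distill_loss(ner_logits, torch.randint(0, 9, (8,)), pos_logits, teacher)
loss.backward()
```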


2020 ◽  
Vol 34 (05) ◽  
pp. 9090-9097
Author(s):  
Niels Van der Heijden ◽  
Samira Abnar ◽  
Ekaterina Shutova

The lack of annotated data in many languages is a well-known challenge within the field of multilingual natural language processing (NLP). Therefore, many recent studies focus on zero-shot transfer learning and joint training across languages to overcome data scarcity for low-resource languages. In this work, we (i) perform a comprehensive comparison of state-of-the-art multilingual word and sentence encoders on the tasks of named entity recognition (NER) and part-of-speech (POS) tagging; and (ii) propose a new method for creating multilingual contextualized word embeddings, compare it to multiple baselines, and show that it performs at or above the state-of-the-art level in zero-shot transfer settings. Finally, we show that our method allows for better knowledge sharing across languages in a joint training setting.
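A minimal sketch of why zero-shot transfer works with a shared multilingual encoder: sentences in different languages land near each other in one embedding space, so a classifier trained on one language's features can be applied unchanged to another. The choice of xlm-roberta-base and mean pooling are stand-in assumptions; the paper compares several encoders and proposes its own embedding method.

```python
# Sketch: parallel sentences in different languages map to nearby points
# in a multilingual encoder's embedding space, enabling zero-shot transfer.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
enc = AutoModel.from_pretrained("xlm-roberta-base")

def sentence_vector(sentence):
    """Mean-pooled contextual vector for one sentence (a crude baseline)."""
    batch = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = enc(**batch).last_hidden_state   # (1, seq_len, 768)
    return hidden.mean(dim=1)

en = sentence_vector("Paris is the capital of France.")
nl = sentence_vector("Parijs is de hoofdstad van Frankrijk.")
print(torch.cosine_similarity(en, nl))  # high similarity across languages
```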


Author(s):  
Artūrs Znotiņš ◽  
Guntis Barzdiņš

This paper presents LVBERT – the first publicly available monolingual language model pre-trained for Latvian. We show that LVBERT improves the state of the art for three Latvian NLP tasks: Part-of-Speech tagging, Named Entity Recognition and Universal Dependency parsing. We release LVBERT to facilitate future research and downstream applications for Latvian NLP.
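A hedged usage sketch with Hugging Face transformers follows; the hub identifier below is an assumption, so substitute whatever checkpoint name the authors actually release.

```python
# Loading a BERT-style Latvian model for token-classification fine-tuning.
# The hub id "AiLab-IMCS-UL/lvbert" is hypothetical; check the release page.
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_id = "AiLab-IMCS-UL/lvbert"  # assumption, not a confirmed identifier
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id, num_labels=9)

batch = tok("Rīga ir Latvijas galvaspilsēta.", return_tensors="pt")
logits = model(**batch).logits     # (1, seq_len, 9) NER tag scores
```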


2021 ◽  
pp. 1-12
Author(s):  
Yingwen Fu ◽  
Nankai Lin ◽  
Xiaotian Lin ◽  
Shengyi Jiang

Named entity recognition (NER) is fundamental to natural language processing (NLP). Most state-of-the-art research on NER is based on pre-trained language models (PLMs) or classic neural models. However, this research is mainly oriented toward high-resource languages such as English, while for Indonesian, related resources (both datasets and technology) are not yet well developed. Moreover, affixation is an important word-formation process in Indonesian, which makes character and token features essential for token-wise Indonesian NLP tasks; the features extracted by current top-performing models are insufficient. Targeting the Indonesian NER task, in this paper we build an Indonesian NER dataset (IDNER) comprising over 50 thousand sentences (over 670 thousand tokens) to alleviate the shortage of labeled resources in Indonesian. Furthermore, we construct a hierarchical structured-attention-based model (HSA) for Indonesian NER that extracts sequence features from different perspectives. Specifically, we use an enhanced convolutional structure as well as an enhanced attention structure to extract deeper features from characters and tokens. Experimental results show that HSA establishes competitive performance on IDNER and three benchmark datasets.
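As a rough illustration of the two-granularity design, the sketch below combines a character-level convolution (suited to capturing Indonesian affixes) with token-level self-attention. It is a simplified stand-in for the paper's enhanced convolution and attention structures; all sizes are assumptions.

```python
# Character CNN (affix-sensitive) + token-level self-attention, combined
# into per-token features that a tag head could consume.
import torch
import torch.nn as nn

class CharTokenEncoder(nn.Module):
    def __init__(self, n_chars=100, vocab=1000, dim=64):
        super().__init__()
        self.char_embed = nn.Embedding(n_chars, dim, padding_idx=0)
        self.char_conv = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.tok_embed = nn.Embedding(vocab, dim)
        self.attn = nn.MultiheadAttention(2 * dim, num_heads=4, batch_first=True)

    def forward(self, tokens, chars):              # chars: (B, T, max_chars)
        B, T, C = chars.shape
        c = self.char_embed(chars.view(B * T, C)).transpose(1, 2)
        c = torch.relu(self.char_conv(c)).max(dim=2).values.view(B, T, -1)
        x = torch.cat([self.tok_embed(tokens), c], dim=-1)
        out, _ = self.attn(x, x, x)                # token-level attention
        return out                                 # (B, T, 2*dim) -> tag head

enc = CharTokenEncoder()
feats = enc(torch.randint(0, 1000, (2, 8)), torch.randint(0, 100, (2, 8, 12)))
```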


2021 ◽  
Vol 54 (1) ◽  
pp. 1-39
Author(s):  
Zara Nasar ◽  
Syed Waqar Jaffry ◽  
Muhammad Kamran Malik

With the advent of Web 2.0, many online platforms produce massive amounts of textual data. With ever-increasing textual data at hand, it is of immense importance to extract information nuggets from this data. One approach towards effectively harnessing this unstructured textual data is its transformation into structured text. Hence, this study presents an overview of approaches that can be applied to extract key insights from textual data in a structured way. To this end, Named Entity Recognition and Relation Extraction are the main focus of this review. The former deals with the identification of named entities, and the latter with the problem of extracting relations between sets of entities. This study covers early approaches as well as the developments made up to now using machine learning models. The survey findings conclude that deep-learning-based hybrid and joint models currently dominate the state of the art. It is also observed that annotated benchmark datasets for various textual-data generators, such as Twitter and other social forums, are not available; this scarcity of datasets has resulted in relatively slow progress in these domains. Additionally, the majority of state-of-the-art techniques are offline and computationally expensive. Last, with the increasing focus on deep-learning frameworks, there is a need to understand and explain the processes at work inside deep architectures.


Author(s):  
Rodrigo Agerri ◽  
German Rigau

We present a multilingual Named Entity Recognition approach based on a robust and general set of features that works across languages and datasets. Our system combines shallow local information with clustering-based semi-supervised features induced on large amounts of unlabeled text. Understanding, via empirical experimentation, how to effectively combine various types of clustering features allows us to seamlessly export our system to other datasets and languages. The result is a simple but highly competitive system which obtains state-of-the-art results across five languages and twelve datasets. The results are reported on standard shared-task evaluation data such as CoNLL for English, Spanish and Dutch. Furthermore, and despite the lack of linguistically motivated features, we also report the best results for languages such as Basque and German. In addition, we demonstrate that our method obtains very competitive results even when the amount of supervised data is cut by half, alleviating the dependency on manually annotated data. Finally, the results show that our emphasis on clustering features is crucial for developing robust out-of-domain models. The system and models are freely available to facilitate their use and guarantee the reproducibility of results.
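The clustering-feature idea can be sketched as follows: each word maps to a hierarchical cluster bit-string induced from unlabeled text (Brown-style), and prefixes of that bit-string become features, letting rare words share evidence with frequent cluster-mates. The toy cluster table and feature names below are assumptions.

```python
# Cluster-prefix features: "london" and "paris" share brown_2/brown_4
# features despite different surface forms, which is what makes the
# features robust across datasets. The BROWN table is a toy stand-in.
from sklearn.feature_extraction import DictVectorizer

BROWN = {"london": "110100", "paris": "110101", "monday": "0111"}  # toy clusters

def token_features(word, prefixes=(2, 4, 6)):
    feats = {"lower": word.lower(), "is_title": word.istitle()}
    bits = BROWN.get(word.lower())
    if bits:
        for p in prefixes:                 # hierarchical cluster prefixes
            feats[f"brown_{p}"] = bits[:p]
    return feats

vec = DictVectorizer()
X = vec.fit_transform([token_features(w) for w in ["London", "Paris", "Monday"]])
print(vec.get_feature_names_out())         # shared brown_* features appear once
```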


2015 ◽  
Vol 12 (2) ◽  
pp. 465-486
Author(s):  
Dejan Mancev ◽  
Branimir Todorovic

Structured learning algorithms usually require inference during the training procedure. Due to the exponential size of the output space, the parameter update is performed only on a relatively small collection built from the “best” structures. The k-best MIRA is an example of an online algorithm which seeks optimal parameters by making updates on the k structures with the highest score at a time. Following the idea of using the k best structures during the learning process, in this paper we introduce four new k-best extensions of max-margin structured algorithms. We discuss their properties and connections, and evaluate all algorithms on two sequence labeling problems: shallow parsing and named entity recognition. The experiments show how the proposed algorithms are affected by changes of k in terms of F-measure and computational time, and that the proposed algorithms can improve results in comparison to the single-best case. Moreover, restricting the proposed algorithms to the single-best case yields a direct comparison with the existing algorithms.
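A simplified sketch of a k-best MIRA-style update: instead of solving the joint quadratic program over all k margin constraints exactly, it applies one clipped passive-aggressive step per violated constraint, a common approximation. The synthetic feature vectors and losses are placeholders.

```python
# Simplified k-best MIRA update: for each of the k highest-scoring
# structures, take a clipped step that satisfies its margin constraint
#   w . (f_gold - f_pred) >= loss(pred)
import numpy as np

def kbest_mira_update(w, f_gold, kbest, C=0.1):
    """kbest: list of (features, structural_loss) for the top-k structures."""
    for f_pred, loss in kbest:
        diff = f_gold - f_pred
        margin = loss - w @ diff            # positive => constraint violated
        denom = diff @ diff
        if margin > 0 and denom > 0:
            alpha = min(C, margin / denom)  # clipped step size
            w = w + alpha * diff
    return w

rng = np.random.default_rng(0)
w = np.zeros(20)
f_gold = rng.normal(size=20)
kbest = [(rng.normal(size=20), 2.0 - 0.5 * i) for i in range(3)]  # k = 3
w = kbest_mira_update(w, f_gold, kbest)
```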

