Named Entity Recognition using Gazetteer Method and N-gram Technique for an Inflectional Language: A Hybrid Approach

Aspect-based sentiment analysis techniques have been applied in several application domains. Over the last two decades, these techniques have been developed mostly for the product and service domains, and very few have been proposed for the movie domain. Moreover, the existing movie-domain techniques mine only specific aspects (Script, Director, and Actor), even though the movie domain is more complex than the product and service domains: it involves a Named Entity Recognition (NER) problem that cannot be ignored, because opinions are often associated with named entities. This paper therefore proposes MAIM (Movie Aspect Identification Model), which not only extracts movie-specific aspects but also identifies named entities (NEs) such as Person Name and Movie Title. The three main contributions are 1) the identification of infrequent aspects, 2) the identification of NEs in the movie domain, and 3) the identification of N-gram opinion words as entities. MAIM incorporates a BiLSTM-CRF hybrid technique and, evaluated on the movie domain, achieves a precision of 89.9%, a recall of 88.9%, and an F1-measure of 89.4%. The experimental results show that MAIM outperforms the baseline models CRF and LSTM-CRF.
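The reported F1-measure can be checked against the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch of that arithmetic only; the function name `f1_score` is a hypothetical helper chosen for illustration and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
f1 = f1_score(89.9, 88.9)
print(round(f1, 1))  # 89.4
```

Rounded to one decimal place, the result agrees with the 89.4% stated in the abstract.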


Rule-based pattern extractor and named entity recognition: A hybrid approach

2010 International Symposium on Information Technology ◽ 10.1109/itsim.2010.5561392 ◽ 2010

Cited By ~ 3

Author(s): Yunita Sari ◽ Mohd Fadzil Hassan ◽ Norshuhani Zamin

Keyword(s): Hybrid Approach ◽ Named Entity Recognition ◽ Entity Recognition ◽ Rule Based ◽ Named Entity


A Neural N-Gram-Based Classifier for Chinese Clinical Named Entity Recognition

Applied Sciences ◽ 10.3390/app11188682 ◽ 2021 ◽ Vol 11 (18) ◽ pp. 8682

Author(s): Ching-Sheng Lin ◽ Jung-Sing Jwo ◽ Cheng-Hsiung Lee

Keyword(s): Short Term Memory ◽ Conditional Random Field ◽ Named Entity Recognition ◽ Neural Model ◽ Entity Recognition ◽ Chinese Word Segmentation ◽ Named Entities ◽ Named Entity ◽ Biomedical Systems ◽ N Gram

Clinical Named Entity Recognition (CNER) focuses on locating named entities in electronic medical records (EMRs), and its results play an important role in the development of intelligent biomedical systems. Beyond research on alphabetic languages, the study of non-alphabetic languages has also attracted considerable attention. In this paper, a neural model is proposed to extract entities from EMRs written in Chinese. To avoid the noise introduced by Chinese word segmentation errors, we use character embeddings as the only feature, without extra resources. In our model, concatenated n-gram character embeddings represent the context semantics. A self-attention mechanism is then applied to model long-range dependencies among the embeddings. The concatenation of the new representations produced by the attention module is fed to a bidirectional long short-term memory (BiLSTM) network, followed by a conditional random field (CRF) layer that extracts the entities. An empirical study on the CCKS-2017 Shared Task 2 dataset shows that our model outperforms other approaches.
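The character n-gram idea behind this input representation can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `char_ngrams` and `ngram_context` and the parameter `max_n` are hypothetical, and the embedding lookup, self-attention, BiLSTM, and CRF layers are not shown. It only shows how n-grams can be taken directly from characters, so that no word segmentation step is needed.

```python
from typing import List


def char_ngrams(text: str, n: int) -> List[str]:
    """Return all overlapping character n-grams of `text`, left to right."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_context(text: str, max_n: int = 3) -> List[List[str]]:
    """For each character position, collect the n-grams (n = 1..max_n)
    that start at that position and fit inside the text.

    In a neural model, each n-gram would be mapped to an embedding and the
    embeddings for one position concatenated (an assumption for illustration).
    """
    features = []
    for i in range(len(text)):
        grams = [text[i:i + n] for n in range(1, max_n + 1) if i + n <= len(text)]
        features.append(grams)
    return features


print(char_ngrams("abcd", 2))    # ['ab', 'bc', 'cd']
print(ngram_context("abc", 2))   # [['a', 'ab'], ['b', 'bc'], ['c']]
```

Because Python strings are sequences of Unicode code points, the same slicing applies unchanged to Chinese text, where each character is one element.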
