scholarly journals Named Entity Recognition using Gazetteer Method and N-gram Technique for an Inflectional Language: A Hybrid Approach

2013 ◽  
Vol 84 (9) ◽  
pp. 31-35 ◽  
Author(s):  
Arindam Dey ◽  
Bipul Syam Prukayastha
Author(s):  
Hamed Moradi ◽  
Farid Ahmadi ◽  
Mohammad-Reza Feizi-Derakhshi

2020 ◽  
Vol 49 (4) ◽  
pp. 564-582
Author(s):  
Jibran Mir ◽  
Azhar Mahmood

Aspect Based Sentiment Analysis techniques have been applied in several application domains. From the last two decades, these techniques have been developed mostly for product and service application domains. However, very few aspect-based sentiment techniques have been proposed for the movie application domain. Moreover, these techniques only mine specific aspects (Script, Director, and Actor) of a movie application domain, nevertheless, the movie application domain is more complex than the product and service application domain. Since, it contains NER (Named Entity Recognition) problem and it cannot be ignored, since there is an opinion often associated with it. Consequently, in this paper MAIM (Movie Aspect Identification Model) is proposed that can extract not only movie specific aspects, also identifies NEs (Named Entities) such as Person Name and Movie Title. The three main contributions are 1) the identification of infrequent aspects, 2) the identification of NE (named entity) in movie application domain, 3) identifying N-gram opinion words as an entity. MAIM incorporates the BiLSTM-CRF hybrid technique and is implemented on the movie application domain having precision 89.9%, recall 88.9% and f1-measure 89.4%. The experimental results show that MAIM performs better than baseline models CRF and LSTM-CRF.


2021 ◽  
Vol 11 (18) ◽  
pp. 8682
Author(s):  
Ching-Sheng Lin ◽  
Jung-Sing Jwo ◽  
Cheng-Hsiung Lee

Clinical Named Entity Recognition (CNER) focuses on locating named entities in electronic medical records (EMRs) and the obtained results play an important role in the development of intelligent biomedical systems. In addition to the research in alphabetic languages, the study of non-alphabetic languages has attracted considerable attention as well. In this paper, a neural model is proposed to address the extraction of entities from EMRs written in Chinese. To avoid erroneous noise being caused by the Chinese word segmentation, we employ the character embeddings as the only feature without extra resources. In our model, concatenated n-gram character embeddings are used to represent the context semantics. The self-attention mechanism is then applied to model long-range dependencies of embeddings. The concatenation of the new representations obtained by the attention module is taken as the input to bidirectional long short-term memory (BiLSTM), followed by a conditional random field (CRF) layer to extract entities. The empirical study is conducted on the CCKS-2017 Shared Task 2 dataset to evaluate our method and the experimental results show that our model outperforms other approaches.


Sign in / Sign up

Export Citation Format

Share Document