Movie Aspects Identification Model for Aspect Based Sentiment Analysis

2020 ◽  
Vol 49 (4) ◽  
pp. 564-582
Author(s):  
Jibran Mir ◽  
Azhar Mahmood

Aspect Based Sentiment Analysis techniques have been applied in several application domains. Over the last two decades, these techniques have been developed mostly for the product and service domains, and very few aspect-based sentiment techniques have been proposed for the movie domain. Moreover, existing techniques mine only specific aspects of a movie (Script, Director, and Actor), even though the movie domain is more complex than the product and service domains: it also poses a Named Entity Recognition (NER) problem that cannot be ignored, since opinions are often associated with named entities. Consequently, this paper proposes MAIM (Movie Aspect Identification Model), which not only extracts movie-specific aspects but also identifies NEs (named entities) such as person names and movie titles. The three main contributions are 1) the identification of infrequent aspects, 2) the identification of named entities in the movie domain, and 3) the identification of n-gram opinion words as entities. MAIM incorporates a hybrid BiLSTM-CRF technique and, evaluated on the movie domain, achieves 89.9% precision, 88.9% recall and 89.4% F1-measure. The experimental results show that MAIM performs better than the baseline models CRF and LSTM-CRF.
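The BiLSTM-CRF hybrid named above is a standard sequence-tagging architecture. Below is a minimal sketch of one in PyTorch, assuming the third-party pytorch-crf package (pip install pytorch-crf); the tag set, dimensions, and toy data are illustrative assumptions, not the authors' actual configuration.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # third-party pytorch-crf package

class BiLSTMCRFTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim // 2, bidirectional=True,
                              batch_first=True)
        self.hidden2tag = nn.Linear(hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, token_ids, tags=None, mask=None):
        emissions = self.hidden2tag(self.bilstm(self.embedding(token_ids))[0])
        if tags is not None:                 # training: negative log-likelihood
            return -self.crf(emissions, tags, mask=mask)
        return self.crf.decode(emissions, mask=mask)  # inference: best tag paths

# Toy usage: the tags could be BIO labels over aspects and named entities,
# e.g. O, B-ASPECT, I-ASPECT, B-PERSON, B-TITLE (hypothetical tag set).
model = BiLSTMCRFTagger(vocab_size=5000, num_tags=5)
tokens = torch.randint(1, 5000, (2, 7))      # batch of 2 sentences, 7 tokens each
tags = torch.randint(0, 5, (2, 7))
loss = model(tokens, tags)
loss.backward()
print(model(tokens))                          # decoded tag sequences
```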

Author(s):  
Edgar Casasola Murillo ◽  
Raquel Fonseca

Abstract: One of the major consequences of the growth of social networks has been the generation of huge volumes of content. The text generated in social networks constitutes a new type of content that is short, informal, grammatically deficient in some cases, and noise prone. Given the volume of information produced every day, manual processing of this data is impractical, creating the need to explore and apply automatic processing strategies such as Entity Recognition (ER), and to evaluate the performance of traditional ER algorithms on corpora with those characteristics. This paper presents the results of applying the AlchemyAPI and Dandelion API algorithms to a corpus provided by the SemEval-2015 Aspect Based Sentiment Analysis task. The entities recognized by each algorithm were compared against the ones annotated in the collection in order to calculate their precision and recall. Dandelion API achieved better results than AlchemyAPI on the given corpus.
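The evaluation described here reduces to a set comparison between recognized and annotated entities. A minimal sketch, assuming exact-match scoring over (surface form, type) pairs; the example entities are hypothetical, not taken from the SemEval-2015 collection.

```python
# Compare the entities an ER service returns against gold annotations.
def precision_recall(predicted, gold):
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical output for one restaurant-review sentence:
predicted = {("sushi", "FOOD"), ("Manhattan", "LOCATION")}
gold = {("sushi", "FOOD"), ("service", "SERVICE"), ("Manhattan", "LOCATION")}
print(precision_recall(predicted, gold))   # (1.0, 0.666...)
```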


2021 ◽  
Vol 11 (18) ◽  
pp. 8682
Author(s):  
Ching-Sheng Lin ◽  
Jung-Sing Jwo ◽  
Cheng-Hsiung Lee

Clinical Named Entity Recognition (CNER) focuses on locating named entities in electronic medical records (EMRs), and the obtained results play an important role in the development of intelligent biomedical systems. In addition to research on alphabetic languages, the study of non-alphabetic languages has attracted considerable attention as well. In this paper, a neural model is proposed to address the extraction of entities from EMRs written in Chinese. To avoid noise caused by erroneous Chinese word segmentation, we employ character embeddings as the only feature, without extra resources. In our model, concatenated n-gram character embeddings are used to represent the context semantics. A self-attention mechanism is then applied to model long-range dependencies among the embeddings. The concatenation of the new representations obtained by the attention module is taken as the input to a bidirectional long short-term memory (BiLSTM) network, followed by a conditional random field (CRF) layer that extracts the entities. An empirical study is conducted on the CCKS-2017 Shared Task 2 dataset to evaluate our method, and the experimental results show that our model outperforms other approaches.
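A rough sketch of the encoder this abstract describes, under assumptions about details the abstract leaves open (embedding sizes, n-gram orders, and exactly which representations are concatenated before the BiLSTM); a CRF layer, as in the earlier sketch, would sit on top of the returned per-character features.

```python
import torch
import torch.nn as nn

class NGramAttentionEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, ngrams=(1, 2, 3), hidden_dim=128):
        super().__init__()
        # One embedding table per n-gram order (unigram, bigram, trigram ids).
        self.tables = nn.ModuleList(
            nn.Embedding(vocab_size, emb_dim, padding_idx=0) for _ in ngrams)
        d_model = emb_dim * len(ngrams)
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.bilstm = nn.LSTM(2 * d_model, hidden_dim // 2, bidirectional=True,
                              batch_first=True)

    def forward(self, ngram_ids):            # ngram_ids: (orders, batch, seq)
        x = torch.cat([table(ids) for table, ids in zip(self.tables, ngram_ids)],
                      dim=-1)                # concatenated n-gram char embeddings
        attended, _ = self.attn(x, x, x)     # self-attention for long-range deps
        # Assumed reading of the abstract: attention output is concatenated
        # with the original embeddings before the BiLSTM.
        out, _ = self.bilstm(torch.cat([x, attended], dim=-1))
        return out                           # per-character features for a CRF

ids = torch.randint(1, 1000, (3, 2, 10))     # 3 n-gram orders, batch 2, length 10
print(NGramAttentionEncoder(vocab_size=1000)(ids).shape)  # torch.Size([2, 10, 128])
```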


2019 ◽  
Vol 178 (46) ◽  
pp. 18-23
Author(s):  
Sangeeta Oswal ◽  
Ravikumar Soni ◽  
Omkar Narvekar ◽  
Abhijit Pradha

2021 ◽  
Vol 11 (22) ◽  
pp. 11017
Author(s):  
László Nemes ◽  
Attila Kiss

Social media platforms are increasingly being used to communicate information, a trend that has only intensified during the pandemic. News portals and governments are also paying increasing attention to digital communication, announcements, and response or reaction monitoring. Twitter, one of the largest social networking sites, which has become even more important for communicating information during the pandemic, provides space for many different opinions and news items, with many discussions as well. In this paper, we look at the sentiments of people and use tweets to determine how people have related to COVID-19 over a given period of time. These sentiment analyses are augmented with information extraction and named entity recognition to get an even more comprehensive picture. The sentiment analysis is based on the Bidirectional Encoder Representations from Transformers (BERT) model, which serves as the basic measurement model for the comparisons. We consider BERT the baseline and compare its results with RNN, NLTK and TextBlob sentiment analyses. The RNN results are significantly closer to the benchmark results given by BERT; both models are able to categorize all tweets without a single tweet falling into the neutral category. A deeper analysis of these results then yields an even more concise picture of people's emotional state in the given period of time. The data from these analyses further support the emotional categories and provide a deeper understanding that can serve as a solid starting point for other disciplines as well, such as linguistics or psychology. Thus, sentiment analysis, supplemented with information extraction and named entity recognition analyses, can provide a supported and deeply explored picture of specific sentiment categories and user attitudes.
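Two of the baselines named here, TextBlob and NLTK (its VADER analyzer), can be run in a few lines. A minimal sketch of lexicon-based three-way labeling; the polarity thresholds and the example tweet are illustrative assumptions, not the paper's setup.

```python
from textblob import TextBlob
from nltk.sentiment import SentimentIntensityAnalyzer  # needs nltk.download('vader_lexicon')

_vader = SentimentIntensityAnalyzer()

def textblob_label(text, eps=0.05):
    p = TextBlob(text).sentiment.polarity          # polarity in [-1, 1]
    return "positive" if p > eps else "negative" if p < -eps else "neutral"

def vader_label(text, eps=0.05):
    c = _vader.polarity_scores(text)["compound"]   # compound score in [-1, 1]
    return "positive" if c > eps else "negative" if c < -eps else "neutral"

tweet = "Vaccines are finally rolling out, feeling hopeful!"  # hypothetical tweet
print(textblob_label(tweet), vader_label(tweet))
```

BERT and RNN classifiers would be scored the same way, by mapping their predicted classes onto the same label set and comparing distributions across the collected tweets.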


2015 ◽  
Vol 21 (4) ◽  
pp. 653-659 ◽  
Author(s):  
ROBERT DALE

Abstract: With NLP services now widely available via cloud APIs, tasks like named entity recognition and sentiment analysis are virtually commodities. We look at what's on offer, and make some suggestions for how to get rich.


2020 ◽  
Vol 27 (1) ◽  
pp. 35-64
Author(s):  
Emre Kağan Akkaya ◽  
Burcu Can

Abstract: In this article, we investigate using deep neural networks with different word representation techniques for named entity recognition (NER) on Turkish noisy text. We argue that valuable latent features for NER can, in fact, be learned without using any hand-crafted features and/or domain-specific resources such as gazetteers and lexicons. In this regard, we utilize character-level, character n-gram-level, morpheme-level, and orthographic character-level word representations. Since noisy data with NER annotation are scarce for Turkish, we introduce a transfer learning model in order to learn infrequent entity types, as an extension to the Bi-LSTM-CRF architecture, by incorporating an additional conditional random field (CRF) layer that is trained on a larger (but formal) text and a noisy text simultaneously. This allows us to learn from both formal and informal/noisy text, thus further improving the performance of our model for rarely seen entity types. We experimented on Turkish as a morphologically rich language and English as a relatively morphologically poor language. We obtained an entity-level F1 score of 67.39% on Turkish noisy data and 45.30% on English noisy data, which outperforms the current state-of-the-art models on noisy text. The English scores are lower than the Turkish scores because of the intense sparsity introduced into the data by user writing styles. The results show that using subword information contributes significantly to learning latent features for morphologically rich languages.
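Entity-level F1, as reported here, scores exactly matched (start, end, type) spans rather than individual tags. A minimal sketch over BIO-tagged sequences; the tag sequences are illustrative, not drawn from the paper's data.

```python
# Convert a BIO tag sequence into a set of (start, end, type) entity spans.
def bio_spans(tags):
    spans, start = set(), None
    for i, tag in enumerate(tags + ["O"]):      # sentinel flushes a trailing span
        if start is not None and not tag.startswith("I-"):
            spans.add((start, i, tags[start][2:]))
            start = None
        if tag.startswith("B-"):
            start = i
    return spans

def entity_f1(pred_tags, gold_tags):
    pred, gold = bio_spans(pred_tags), bio_spans(gold_tags)
    tp = len(pred & gold)                        # exact span-and-type matches
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
print(entity_f1(pred, gold))                     # 0.666...
```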

