Evaluating Richer Features and Varied Machine Learning Models for Subjectivity Classification of Book Review Sentences in Portuguese

Information ◽  
2020 ◽  
Vol 11 (9) ◽  
pp. 437
Author(s):  
Luana Balador Belisário ◽  
Luiz Gabriel Ferreira ◽  
Thiago Alexandre Salgueiro Pardo

Texts published on social media have become a valuable source of information for companies and users, as analyzing this data helps to improve and select products and services of interest. Due to the huge amount of data, techniques for automatically analyzing user opinions are necessary. The research field that investigates these techniques is called sentiment analysis. This paper focuses specifically on the task of subjectivity classification, which aims to predict whether a text passage conveys an opinion. We report the study and comparison of machine learning methods of different paradigms to perform subjectivity classification of book review sentences in Portuguese, which have proven to be a challenging domain in the area. Specifically, we explore richer features for the task, using several lexical, centrality-based and discourse features. We show the contributions of the different feature sets and provide evidence that the combination of lexical, centrality-based and discourse features produces better results than any of the feature sets individually. Additionally, by analyzing the achieved results and the knowledge acquired by some symbolic machine learning methods, we show that some discourse relations may clearly signal subjectivity. Our corpus annotation also reveals some distinctive discourse structuring patterns for sentence subjectivity.

Author(s):  
Matheus del Valle ◽  
Kleber Stancari ◽  
Pedro Arthur Augusto de Castro ◽  
Moises Oliveira dos Santos ◽  
Denise Maria Zezell

ACS Omega ◽  
2018 ◽  
Vol 3 (11) ◽  
pp. 15837-15849 ◽  
Author(s):  
Yang Li ◽  
Yujia Tian ◽  
Zijian Qin ◽  
Aixia Yan

PLoS ONE ◽  
2016 ◽  
Vol 11 (12) ◽  
pp. e0166898 ◽  
Author(s):  
Monique A. Ladds ◽  
Adam P. Thompson ◽  
David J. Slip ◽  
David P. Hocking ◽  
Robert G. Harcourt

Author(s):  
Ravi Singh ◽  
Ankit Ganeshpurkar ◽  
Powsali Ghosh ◽  
Ankit Vyankatrao Pokle ◽  
Devendra Kumar ◽  
...  

2020 ◽  
Vol 493 (3) ◽  
pp. 4209-4228 ◽  
Author(s):  
Ting-Yun Cheng ◽  
Christopher J Conselice ◽  
Alfonso Aragón-Salamanca ◽  
Nan Li ◽  
Asa F L Bluck ◽  
...  

ABSTRACT There are several supervised machine learning methods used for the automated morphological classification of galaxies; however, there has not yet been a clear comparison of these different methods using imaging data, or an investigation into maximizing their effectiveness. We carry out a comparison between several common machine learning methods for galaxy classification [Convolutional Neural Network (CNN), K-nearest neighbour, logistic regression, Support Vector Machine, Random Forest, and Neural Networks] by using Dark Energy Survey (DES) data combined with visual classifications from the Galaxy Zoo 1 project (GZ1). Our goal is to determine the optimal machine learning methods when using imaging data for galaxy classification. We show that CNN is the most successful method of these ten methods in our study. Using a sample of ∼2800 galaxies with visual classification from GZ1, we reach an accuracy of ∼0.99 for the morphological classification of ellipticals and spirals. Further investigation of the galaxies whose ML and visual classifications differ, but which have high predicted probabilities in our CNN, usually reveals that the classification provided by GZ1 is incorrect. We further find that the galaxies with a low probability of being either spirals or ellipticals are visually lenticulars (S0), demonstrating that supervised learning is able to rediscover that this class of galaxy is distinct from both ellipticals and spirals. We confirm that ∼2.5 per cent of galaxies are misclassified by GZ1 in our study. After correcting these galaxies’ labels, we improve our CNN performance to an average accuracy of over 0.99 (accuracy of 0.994 is our best result).
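The label-checking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `triage_predictions`, the 0.9 confidence threshold, and the 0.4 to 0.6 "uncertain" band are all hypothetical choices made for illustration, and the probabilities below are made-up toy values. The sketch assumes a binary classifier that outputs a spiral probability per galaxy. It flags objects where a confident prediction disagrees with the visual label (candidate mislabels) and objects where the classifier is undecided between the two classes (candidates for a distinct class such as S0).

```python
from typing import List, Tuple


def triage_predictions(
    p_spiral: List[float],
    visual_labels: List[str],
    confident: float = 0.9,  # assumed threshold, not taken from the paper
    uncertain_band: Tuple[float, float] = (0.4, 0.6),  # assumed band
) -> Tuple[List[int], List[int]]:
    """Return (suspect_label_indices, ambiguous_indices)."""
    suspect: List[int] = []
    ambiguous: List[int] = []
    low, high = uncertain_band
    for i, (p, label) in enumerate(zip(p_spiral, visual_labels)):
        predicted = "spiral" if p >= 0.5 else "elliptical"
        confidence = max(p, 1.0 - p)
        if low <= p <= high:
            # Classifier is undecided: neither class is a good fit.
            ambiguous.append(i)
        elif predicted != label and confidence >= confident:
            # Confident prediction contradicts the visual label.
            suspect.append(i)
    return suspect, ambiguous


# Toy inputs (fabricated for illustration only).
probs = [0.97, 0.05, 0.52, 0.95, 0.08]
labels = ["spiral", "elliptical", "spiral", "elliptical", "spiral"]
suspect, ambiguous = triage_predictions(probs, labels)
print(suspect, ambiguous)
```

In practice the flagged indices would be passed to a human for visual re-inspection rather than relabelled automatically, which matches the workflow the abstract describes.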

