A machine learning-based sentiment analysis of online product reviews with a novel term weighting and feature selection approach

Sentiment analysis of online product reviews has become a mainstream way for businesses on e-commerce platforms to promote their products and improve user satisfaction. Hence, it is necessary to construct an automatic sentiment analyser that identifies the sentiment polarity of online product reviews. Traditional lexicon-based approaches to sentiment analysis suffer from several accuracy issues, while machine learning techniques require labelled training data. This paper introduces a hybrid sentiment analysis framework to bridge the gap between machine learning and lexicon-based approaches. A novel tunicate swarm algorithm (TSA) based feature reduction is integrated with the proposed hybrid method to solve the scalability issue that arises from a large feature set. It reduces the feature set size to 43% without changing the accuracy (93%). In addition, it improves scalability, reduces computation time and enhances the overall performance of the proposed framework. Experimental analysis shows that TSA outperforms existing feature selection techniques such as particle swarm optimization and the genetic algorithm. Moreover, the proposed approach is analysed with performance metrics such as recall, precision, F1-score, feature size and computation time.
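
The abstract above names recall, precision and F1-score as evaluation metrics. As a minimal sketch of how these three are usually computed for a binary polarity task, the following uses only the Python standard library. The function name `precision_recall_f1` and the toy labels are illustrative assumptions, not taken from the paper.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (made up for illustration): 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]

# Here tp = 3, fp = 1, fn = 1.
p, r, f1 = precision_recall_f1(y_true, y_pred)
print(p, r, f1)
```

F1 is the harmonic mean of precision and recall, so it drops sharply when either of the two is low.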

Preprocessing and Feature Selection Approach for Efficient Sentiment Analysis on Product Reviews

Advances in Intelligent Systems and Computing - Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications ◽

10.1007/978-981-10-3153-3_72 ◽

2017 ◽

pp. 721-730 ◽

Cited By ~ 2

Author(s):

Monalisa Ghosh ◽

Gautam Sanyal

Keyword(s):

Feature Selection ◽

Sentiment Analysis ◽

Product Reviews ◽

Selection Approach ◽

Feature Selection Approach

Early Detection of the Alzheimer’s Disease: A Novel Cognitive Feature Selection Approach Using Machine Learning

Advances in Information, Communication and Cybersecurity - Lecture Notes in Networks and Systems ◽

10.1007/978-3-030-91738-8_35 ◽

2022 ◽

pp. 383-392

Author(s):

Muhammad Irfan ◽

Seyed Shahrestani ◽

Mahmoud Elkhodr

Keyword(s):

Machine Learning ◽

Alzheimer's Disease ◽

Feature Selection ◽

Early Detection ◽

Selection Approach ◽

Feature Selection Approach ◽

Cognitive Feature

Development of machine learning‐based real time scheduling systems: using ensemble based on wrapper feature selection approach

International Journal of Production Research ◽

10.1080/00207543.2011.636389 ◽

2012 ◽

Vol 50 (20) ◽

pp. 5887-5905 ◽

Cited By ~ 6

Author(s):

Yeou-Ren Shiue ◽

Ruey‐Shiang Guh ◽

Ken‐Chun Lee

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Real Time ◽

Real Time Scheduling ◽

Selection Approach ◽

Time Scheduling ◽

Feature Selection Approach ◽

Wrapper Feature Selection

Machine learning methods with feature selection approach to estimate software services development effort

International Journal of Services Sciences ◽

10.1504/ijssci.2017.10009007 ◽

2017 ◽

Vol 6 (1) ◽

pp. 26

Author(s):

Amid Khatibi Bardsiri ◽

Seyyed Mohsen Hashemi

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Development Effort ◽

Learning Methods ◽

Machine Learning Methods ◽

Selection Approach ◽

Feature Selection Approach ◽

Software Services

An Effective Multi-Label Feature Selection Model Towards Eliminating Noisy Features

Applied Sciences ◽

10.3390/app10228093 ◽

2020 ◽

Vol 10 (22) ◽

pp. 8093

Author(s):

Jun Wang ◽

Yuanyuan Xu ◽

Hengpeng Xu ◽

Zhe Sun ◽

Zhenglu Yang ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Learning Performance ◽

Space Structures ◽

Learning Tasks ◽

Feature Spaces ◽

Selection Approach ◽

Label Correlations ◽

Feature Selection Approach ◽

Low Dimensional

A consistently great amount of effort in feature selection has been devoted to dimension reduction for various machine learning tasks. Existing feature selection models focus on selecting the most discriminative features for learning targets. However, this strategy is weak in handling two kinds of features, the irrelevant and the redundant ones, which are collectively referred to as noisy features. These features may hamper the construction of optimal low-dimensional subspaces and compromise the learning performance of downstream tasks. In this study, we propose a novel multi-label feature selection approach by embedding label correlations (dubbed ELC) to address these issues. In particular, we extract label correlations to obtain reliable label space structures and employ them to steer feature selection. In this way, the label and feature spaces can be expected to be consistent, and noisy features can be effectively eliminated. An extensive experimental evaluation on public benchmarks validated the superiority of ELC.

Integration of extreme gradient boosting feature selection approach with machine learning models: application of weather relative humidity prediction

Neural Computing and Applications ◽

10.1007/s00521-021-06362-3 ◽

2021 ◽

Author(s):

Hai Tao ◽

Salih Muhammad Awadh ◽

Sinan Q. Salih ◽

Shafik S. Shafik ◽

Zaher Mundher Yaseen

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Relative Humidity ◽

Gradient Boosting ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Selection Approach ◽

Feature Selection Approach ◽

Machine Learning Models

Machine learning methods with feature selection approach to estimate software services development effort

International Journal of Services Sciences ◽

10.1504/ijssci.2017.088034 ◽

2017 ◽

Vol 6 (1) ◽

pp. 26

Author(s):

Amid Khatibi Bardsiri ◽

Seyyed Mohsen Hashemi

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Development Effort ◽

Learning Methods ◽

Machine Learning Methods ◽

Selection Approach ◽

Feature Selection Approach ◽

Software Services

A Feature Selection Approach for Fall Detection Using Various Machine Learning Classifiers

IEEE Access ◽

10.1109/access.2021.3105581 ◽

2021 ◽

pp. 1-1

Author(s):

Tuan Le Minh ◽

Ly Van Tran ◽

Son Vu Truong Dao

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Fall Detection ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Selection Approach ◽

Feature Selection Approach

Sentiment Analysis on E-commerce Product using Machine Learning and Combination of TF-IDF and Backward Elimination

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f7889.038620 ◽

2020 ◽

Vol 8 (6) ◽

pp. 2862-2867

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Feature Selection ◽

Sentiment Analysis ◽

Opinion Mining ◽

Classification Performance ◽

Support Vector ◽

Product Reviews ◽

Feature Selection Technique ◽

Backward Elimination

E-commerce is a website or mobile application platform that helps people buy products. Before purchasing a product, a customer decides whether to buy it by reading reviews from previous buyers. The problem is that there are often so many reviews that reading them all takes a long time. This research uses sentiment analysis to classify the review data. Sentiment analysis, or opinion mining, is a machine learning approach to classifying and analysing texts or documents about human sentiments, emotions, and opinions. In this research, sentiment analysis was used to classify product reviews from e-commerce websites into positive or negative classes. The results could be processed further and used to summarize customers' opinions about a certain product without reading every single review. The goal of this research is to optimize classification performance by using a feature selection technique. Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction, Backward Elimination feature selection, and five different classifiers (Naïve Bayes, Support Vector Machine, K-Nearest Neighbour, Decision Tree, Random Forest) were used in analysing the sentiment of the reviews. The dataset used in this research is in the Indonesian language and is classified into two classes (positive and negative). The best accuracy is achieved by using TF-IDF, Backward Elimination and Support Vector Machine (SVM) with a score of 85.97%, which is an increase of 7.91% compared to the process without feature selection. Based on the results, Backward Elimination feature selection succeeded in improving performance for all classifiers used in this research.
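
The pipeline described in this abstract (TF-IDF weighting followed by Backward Elimination) can be sketched in plain Python. This is only an illustration under several assumptions: the toy English reviews are made up, the TF-IDF variant `tf * (log(N/df) + 1)` is one common choice, and a nearest-centroid classifier stands in for the SVM so that the example needs no external library. All function names are hypothetical and none of this is the authors' implementation.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, sparse vectors) using tf * (log(N/df) + 1)."""
    tokenised = [d.lower().split() for d in docs]
    n = len(tokenised)
    df = Counter(term for doc in tokenised for term in set(doc))
    vectors = [{t: (c / len(doc)) * (math.log(n / df[t]) + 1.0)
                for t, c in Counter(doc).items()} for doc in tokenised]
    return sorted(df), vectors


def centroid(vectors, features):
    """Mean vector of a class, restricted to the given features."""
    return {f: sum(v.get(f, 0.0) for v in vectors) / len(vectors)
            for f in features}


def accuracy(train, valid, features):
    """Validation accuracy of a nearest-centroid classifier (SVM stand-in)."""
    labels = sorted({y for _, y in train})
    cents = {y: centroid([v for v, l in train if l == y], features)
             for y in labels}

    def predict(vec):
        return min(labels, key=lambda y: sum(
            (vec.get(f, 0.0) - cents[y][f]) ** 2 for f in features))

    return sum(predict(v) == y for v, y in valid) / len(valid)


def backward_elimination(train, valid, vocab):
    """Greedily drop one feature at a time while accuracy does not fall."""
    selected = list(vocab)
    best = accuracy(train, valid, selected)
    while len(selected) > 1:
        score, drop = max(
            (accuracy(train, valid, [f for f in selected if f != d]), d)
            for d in selected)
        if score < best:  # every possible removal hurts, so stop here
            break
        best = score
        selected.remove(drop)
    return selected, best


# Toy data (made up): 1 = positive, 0 = negative.
docs = ["good product fast delivery", "bad quality slow delivery",
        "very good quality", "bad product very slow",
        "good and fast", "slow and bad"]
labels = [1, 0, 1, 0, 1, 0]

# For brevity the IDF is computed over all documents; a real experiment
# would fit it on the training split only.
vocab, vectors = tfidf_vectors(docs)
data = list(zip(vectors, labels))
train, valid = data[:4], data[4:]

base_acc = accuracy(train, valid, vocab)
selected, final_acc = backward_elimination(train, valid, vocab)
print(len(vocab), len(selected), base_acc, final_acc)
```

Because a feature is removed only when validation accuracy does not decrease, the final accuracy is never lower than the baseline on that validation split. Whether the gain carries over to unseen data depends on the dataset and the classifier.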