Design of text sentiment analysis tool using feature extraction based on fusing machine learning algorithms

Author(s):  
P. Ajitha ◽  
A. Sivasangari ◽  
R. Immanuel Rajkumar ◽  
S. Poonguzhali

Text Sentiment Analysis is a system where text feeling polarity is positive or negative or neutral from a series of texts or documents or public opinions on a particular product or general subject. Using machine learning and natural language processing techniques, the current work aims to gain insight into sentiment mining on tweets. Text classification is accomplished using Machine Learning Algorithm-based fusion technique. This research suggested a system for grading feelings based on a lexicon. Bag-of-words (BOW) or lexicon-based methodology is currently the main standard way of modeling text for machine learning in sentiment analysis approaches. Marketers can use sentiment analysis to analyze their business and services, public opinion, or to evaluate customer satisfaction. Organizations can even use this analysis to gather significant feedback on issues related to newly released products. The main objective of this is to resolve the data overload problem.

The process of discovering and analyzing the customer feedback using Natural Language Processing (NLP) is said to be sentiment analysis. Based on the surge over the concept of rating level in sentiment analysis, sentiment is utilized as an attribute for certain aspects or features that get expressed and more attention are provided to the problem of detecting the customer reviews. Despite the wide use and popularity of some methods, a better technique for identifying the polarity of a text data is hard to find. Machine learning has recently attracted attention as an approach for sentiment analysis. This work extends the idea of evaluating the performance of various Machine Learning (ML) classifiers namely logistic regression, Naive Bayes, Support Vector Machine (SVM) and Neural Network (NN).To show their effectiveness in sentiment mining of customer product reviews, the customer feedback has been collected from Grocery and Gourmet Food. Nearly 90 thousands customers feedback reviews of various product related categories namely Product ID, rating, review test, review time reviewer ID and reviewer name are used in this analysis. The performance of the classifiers is measured in terms of accuracy, specificity and sensitivity. From the experimental results, the better machine learning classification algorithm is proposed for sentiment mining using online shopping customer review data.


Author(s):  
Kirti Jain

Sentiment analysis, also known as sentiment mining, is a submachine learning task where we want to determine the overall sentiment of a particular document. With machine learning and natural language processing (NLP), we can extract the information of a text and try to classify it as positive, neutral, or negative according to its polarity. In this project, We are trying to classify Twitter tweets into positive, negative, and neutral sentiments by building a model based on probabilities. Twitter is a blogging website where people can quickly and spontaneously share their feelings by sending tweets limited to 140 characters. Because of its use of Twitter, it is a perfect source of data to get the latest general opinion on anything.


2020 ◽  
Vol 7 (10) ◽  
pp. 380-389
Author(s):  
Asogwa D.C ◽  
Anigbogu S.O ◽  
Anigbogu G.N ◽  
Efozia F.N

Author's age prediction is the task of determining the author's age by studying the texts written by them. The prediction of author’s age can be enlightening about the different trends, opinions social and political views of an age group. Marketers always use this to encourage a product or a service to an age group following their conveyed interests and opinions. Methodologies in natural language processing have made it possible to predict author’s age from text by examining the variation of linguistic characteristics. Also, many machine learning algorithms have been used in author’s age prediction. However, in social networks, computational linguists are challenged with numerous issues just as machine learning techniques are performance driven with its own challenges in realistic scenarios. This work developed a model that can predict author's age from text with a machine learning algorithm (Naïve Bayes) using three types of features namely, content based, style based and topic based. The trained model gave a prediction accuracy of 80%.


IoT ◽  
2020 ◽  
Vol 1 (2) ◽  
pp. 218-239 ◽  
Author(s):  
Ravikumar Patel ◽  
Kalpdrum Passi

In the derived approach, an analysis is performed on Twitter data for World Cup soccer 2014 held in Brazil to detect the sentiment of the people throughout the world using machine learning techniques. By filtering and analyzing the data using natural language processing techniques, sentiment polarity was calculated based on the emotion words detected in the user tweets. The dataset is normalized to be used by machine learning algorithms and prepared using natural language processing techniques like word tokenization, stemming and lemmatization, part-of-speech (POS) tagger, name entity recognition (NER), and parser to extract emotions for the textual data from each tweet. This approach is implemented using Python programming language and Natural Language Toolkit (NLTK). A derived algorithm extracts emotional words using WordNet with its POS (part-of-speech) for the word in a sentence that has a meaning in the current context, and is assigned sentiment polarity using the SentiWordNet dictionary or using a lexicon-based method. The resultant polarity assigned is further analyzed using naïve Bayes, support vector machine (SVM), K-nearest neighbor (KNN), and random forest machine learning algorithms and visualized on the Weka platform. Naïve Bayes gives the best accuracy of 88.17% whereas random forest gives the best area under the receiver operating characteristics curve (AUC) of 0.97.


2020 ◽  
Vol 17 (9) ◽  
pp. 4294-4298
Author(s):  
B. R. Sunil Kumar ◽  
B. S. Siddhartha ◽  
S. N. Shwetha ◽  
K. Arpitha

This paper intends to use distinct machine learning algorithms and exploring its multi-features. The primary advantage of machine learning is, a machine learning algorithm can predict its work automatically by learning what to do with information. This paper reveals the concept of machine learning and its algorithms which can be used for different applications such as health care, sentiment analysis and many more. Sometimes the programmers will get confused which algorithm to apply for their applications. This paper provides an idea related to the algorithm used on the basis of how accurately it fits. Based on the collected data, one of the algorithms can be selected based upon its pros and cons. By considering the data set, the base model is developed, trained and tested. Then the trained model is ready for prediction and can be deployed on the basis of feasibility.


2020 ◽  
Vol 17 (7) ◽  
pp. 2869-2875
Author(s):  
Sajay Thomas Samuel ◽  
Booma Poolan Marikannan

Machine learning can help people to perform complex tasks and solve problems as it uses historical data to learn its pattern and make predictions based on the past data. This research addresses the problem about movie reviews on social media specifically Twitter; where it will gather the tweets on movie reviews and display a rating based on the sentiment of the tweet. Twitter is an online social media website where people from all walks of life communicate by tweeting short updates without exceeding the character limit which is 240 characters. Twitter is continuously growing as a business and became one of the biggest platform for communication and instant messaging. Due to the large number of users, there are voluminous amounts of data available that can be used for more in depth information and insights and to get the sentiments from analysing the tweets. In today’s world, there are many applications that are using sentiment analysis in various fields such as to gets insights about a particular brand or product. To do sentiment analysis using the traditional ways can be time consuming and becomes very complex. The aim of this research is to investigate about the domain of sentiment analysis and incorporate a machine learning algorithm to create a system that is able to get and display the ratings of a particular movie. The machine learning algorithms used are Naïve Bayes Classifier and SVM. The algorithm with better accuracy will be chosen for the implementation phase.


Sentiment Analysis is individuals' opinions and feedbacks study towards a substance, which can be items, services, movies, people or events. The opinions are mostly expressed as remarks or reviews. With the social network, gatherings and websites, these reviews rose as a significant factor for the client’s decision to buy anything or not. These days, a vast scalable computing environment provides us with very sophisticated way of carrying out various data-intensive natural language processing (NLP) and machine-learning tasks to examine these reviews. One such example is text classification, a compelling method for predicting the clients' sentiment. In this paper, we attempt to center our work of sentiment analysis on movie review database. We look at the sentiment expression to order the extremity of the movie reviews on a size of 0(highly disliked) to 4(highly preferred) and perform feature extraction and ranking and utilize these features to prepare our multilabel classifier to group the movie review into its right rating. This paper incorporates sentiment analysis utilizing feature-based opinion mining and managed machine learning. The principle center is to decide the extremity of reviews utilizing nouns, verbs, and adjectives as opinion words. In addition, a comparative study on different classification approaches has been performed to determine the most appropriate classifier to suit our concern problem space. In our study, we utilized six distinctive machine learning algorithms – Naïve Bayes, Logistic Regression, SVM (Support Vector Machine), RF (Random Forest) KNN (K nearest neighbors) and SoftMax Regression.


E-commerce is evolving at a rapid pace that new doors have been opened for the people to express their emotions towards the products. The opinions of the customers plays an important role in the e-commerce sites. It is practically a tedious job to analyze the opinions of users and form a pros and cons for respective products. This paper develops a solution through machine learning algorithms by pre-processing the reviews based on features of mobile products. This mainly focus on aspect level of opinions which uses SentiWordNet, Natural Language Processing and aggregate scores for analyzing the text reviews. The experimental results provide the visual representation of products which provide better understanding of product reviews rather than reading through long textual reviews which includes strengths and weakness of the product using Naive Bayes algorithm. This results also helps the e-commerce vendors to overcome the weakness of the products and meet the customer expectations.


Author(s):  
Anurag Langan

Grading student answers is a tedious and time-consuming task. A study had found that almost on average around 25% of a teacher's time is spent in scoring the answer sheets of students. This time could be utilized in much better ways if computer technology could be used to score answers. This system will aim to grade student answers using the various Natural Language processing techniques and Machine Learning algorithms available today.


Author(s):  
Zahra Bokaee Nezhad ◽  
Mohammad Ali Deihimi

Sarcasm is a form of communication where the individual states the opposite of what is implied. Therefore, detecting a sarcastic tone is somewhat complicated due to its ambiguous nature. On the other hand, identification of sarcasm is vital to various natural language processing tasks such as sentiment analysis and text summarisation. However, research on sarcasm detection in Persian is very limited. This paper investigated the sarcasm detection technique on Persian tweets by combining deep learning-based and machine learning-based approaches. Four sets of features that cover different types of sarcasm were proposed, namely deep polarity, sentiment, part of speech, and punctuation features. These features were utilised to classify the tweets as sarcastic and nonsarcastic. In this study, the deep polarity feature was proposed by conducting a sentiment analysis using deep neural network architecture. In addition, to extract the sentiment feature, a Persian sentiment dictionary was developed, which consisted of four sentiment categories. The study also used a new Persian proverb dictionary in the preparation step to enhance the accuracy of the proposed model. The performance of the model is analysed using several standard machine learning algorithms. The results of the experiment showed that the method outperformed the baseline method and reached an accuracy of 80.82%. The study also examined the importance of each proposed feature set and evaluated its added value to the classification.


Sign in / Sign up

Export Citation Format

Share Document