A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis

Furqan Rustam; Madiha Khalid; Waqar Aslam; Vaibhav Rupapara; Arif Mehmood; Gyu Sang Choi

doi:10.1371/journal.pone.0245909


PLoS ONE ◽

10.1371/journal.pone.0245909 ◽

2021 ◽

Vol 16 (2) ◽

pp. e0245909

Author(s):

Furqan Rustam ◽

Madiha Khalid ◽

Waqar Aslam ◽

Vaibhav Rupapara ◽

Arif Mehmood ◽

...

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Performance Comparison ◽

Supervised Machine Learning ◽

Accuracy Score ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Analysis Technique ◽

Realistic Assessment

The spread of Covid-19 has raised health concerns worldwide, and social media is increasingly used to share news and opinions about it. A realistic assessment of the situation is necessary to use resources optimally and appropriately. In this research, we perform sentiment analysis of Covid-19 tweets using a supervised machine learning approach. Identifying Covid-19 sentiments from tweets would allow informed decisions for better handling of the current pandemic situation. The dataset is extracted from Twitter using the tweet IDs provided by IEEE DataPort. Tweets are extracted by an in-house crawler that uses the Tweepy library. The dataset is cleaned using preprocessing techniques, and sentiments are extracted using the TextBlob library. The contribution of this work is the performance evaluation of various machine learning classifiers using our proposed feature set, which is formed by concatenating bag-of-words and term frequency-inverse document frequency (TF-IDF) features. Tweets are classified as positive, neutral, or negative. Classifier performance is evaluated in terms of accuracy, precision, recall, and F1 score. For completeness, the dataset is further investigated using the Long Short-Term Memory (LSTM) deep learning architecture. The results show that Extra Trees Classifiers outperform all other models, achieving a 0.93 accuracy score with our proposed concatenated feature set. The LSTM achieves lower accuracy than the machine learning classifiers. To demonstrate the effectiveness of our proposed feature set, the results are compared with the Vader sentiment analysis technique based on the GloVe feature extraction approach.
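
The concatenated feature set described in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the whitespace tokenisation, the smoothed IDF formula (the same smoothed form scikit-learn's TfidfVectorizer uses by default, here without L2 normalisation), the function names, and the three example tweets are all assumptions made for illustration.

```python
import math
from collections import Counter


def build_vocab(docs):
    """Sorted vocabulary over lower-cased, whitespace-tokenised documents."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})


def bow_vector(doc, vocab):
    """Raw term counts for one document, ordered by the vocabulary."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in vocab]


def tfidf_vectors(docs, vocab):
    """Term count times smoothed IDF: ln((1 + n) / (1 + df)) + 1 (assumed form)."""
    n = len(docs)
    tokenised = [doc.lower().split() for doc in docs]
    doc_freq = {t: sum(1 for toks in tokenised if t in toks) for t in vocab}
    idf = {t: math.log((1 + n) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        vectors.append([counts[t] * idf[t] for t in vocab])
    return vectors


def concatenated_features(docs):
    """Place the bag-of-words block and the TF-IDF block side by side per document."""
    vocab = build_vocab(docs)
    bow = [bow_vector(doc, vocab) for doc in docs]
    tfidf = tfidf_vectors(docs, vocab)
    return vocab, [b + t for b, t in zip(bow, tfidf)]


# Hypothetical example tweets, not taken from the paper's dataset.
tweets = [
    "stay home stay safe",
    "vaccine news is good",
    "lockdown again bad news",
]
vocab, features = concatenated_features(tweets)
print(len(vocab), len(features[0]))  # vocabulary size and concatenated row width
```

Each row is twice as wide as the vocabulary: the first half holds raw counts and the second half holds the TF-IDF weights. A classifier such as Extra Trees would then be trained on these rows.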


A performance comparison of machine learning classifiers for Covid-19 Arabic Quarantine tweets sentiment analysis

10.1109/esmarta52612.2021.9515749 ◽

2021 ◽

Author(s):

Abdulqader Mohsen ◽

Yousef Ali ◽

Wedad Al-Sorori ◽

Naseebah A. Maqtary ◽

Belal Al-Fuhaidi ◽

...

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Performance Comparison ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

A Performance


Financial Context News Sentiment Analysis for the Lithuanian Language

Applied Sciences ◽

10.3390/app11104443 ◽

2021 ◽

Vol 11 (10) ◽

pp. 4443

Author(s):

Rokas Štrimaitis ◽

Pavel Stefanovič ◽

Simona Ramanauskaitė ◽

Asta Slotkienė

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Experimental Investigations ◽

Support Vector ◽

Applied Machine Learning ◽

Bayes Algorithm ◽

Website Content

Financial analysis is not limited to analyzing enterprise performance. It is worth analyzing as wide an area as possible to obtain a full impression of a specific enterprise. News website content is a data source that expresses the public’s opinion on enterprise operations, status, etc. Therefore, it is worth analyzing the text of news portal articles. Sentiment analysis methods for English texts and for financial texts exist and are accurate, whereas work on the more complex Lithuanian language has mostly concentrated on sentiment analysis of comment texts and does not provide high accuracy. Therefore, in this paper, a supervised machine learning model was implemented to assign sentiment to financial context news gathered from Lithuanian-language websites. The analysis was made using three classification algorithms commonly used in the field of sentiment analysis. Hyperparameter optimization using grid search was performed to find the best parameters of each classifier. All experimental investigations were made using newly collected datasets from four Lithuanian news websites. The results of the applied machine learning algorithms show that the highest accuracy is obtained on a non-balanced dataset with the multinomial Naive Bayes algorithm (71.1%). The accuracies of the other algorithms were slightly lower: long short-term memory (71%) and support vector machine (70.4%).
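
As a rough illustration of grid search over a classifier hyperparameter, the sketch below tunes the smoothing parameter of a small multinomial Naive Bayes model. It is not the paper's setup: the English headlines, the labels, the candidate alpha values, and the single hold-out split (instead of cross-validation) are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha):
    """Fit multinomial Naive Bayes with additive smoothing alpha (alpha > 0)."""
    vocab = {tok for doc in docs for tok in doc.split()}
    class_sizes = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc.split())
    model = {}
    for label, size in class_sizes.items():
        denom = sum(token_counts[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(size / len(docs)),
            "log_lik": {
                t: math.log((token_counts[label][t] + alpha) / denom) for t in vocab
            },
        }
    return model


def predict_nb(model, doc):
    """Return the class with the highest log posterior; unseen tokens are skipped."""
    def score(label):
        params = model[label]
        return params["log_prior"] + sum(
            params["log_lik"][t] for t in doc.split() if t in params["log_lik"]
        )
    return max(sorted(model), key=score)


def accuracy(model, docs, labels):
    hits = sum(predict_nb(model, doc) == label for doc, label in zip(docs, labels))
    return hits / len(docs)


def grid_search(train, valid, alphas):
    """Try every candidate alpha and keep the one with the best validation accuracy."""
    results = {a: accuracy(train_nb(*train, a), *valid) for a in alphas}
    best = max(results, key=results.get)
    return best, results


# Hypothetical data for illustration only.
train = (
    ["profit rises strongly", "shares fall sharply",
     "company reports profit growth", "losses widen shares drop"],
    ["pos", "neg", "pos", "neg"],
)
valid = (["profit growth continues", "shares drop again"], ["pos", "neg"])
best_alpha, scores = grid_search(train, valid, alphas=[0.1, 0.5, 1.0])
print(best_alpha, scores)
```

With several hyperparameters, the same loop would iterate over the Cartesian product of the candidate values, and each candidate would usually be scored by cross-validation rather than on one split.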


A Pragmatic Comparison of Supervised Machine Learning Classifiers for Disease Diagnosis

10.1109/icirca51532.2021.9544582 ◽

2021 ◽

Author(s):

Ifra Altaf ◽

Muheet Ahmed Butt ◽

Majid Zaman

Keyword(s):

Machine Learning ◽

Disease Diagnosis ◽

Supervised Machine Learning ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Supervised Machine Learning Classifiers


Classifying Lensed Gravitational Waves in the Geometrical Optics Limit with Machine Learning

American Journal of Undergraduate Research ◽

10.33697/ajur.2019.019 ◽

2019 ◽

Vol 16 (2) ◽

pp. 5-16

Author(s):

Amit Singh ◽

Ivan Li ◽

Otto Hannuksela ◽

Tjonnie Li ◽

Kyungmin Kim

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Gravitational Wave ◽

Gravitational Waves ◽

Geometrical Optics ◽

Supervised Machine Learning ◽

Support Vector ◽

Multi Layer Perceptron ◽

Machine Learning Classifiers ◽

Learning Classifiers

Gravitational waves are theorized to be gravitationally lensed when they propagate near massive objects. Such lensing effects cause potentially detectable repeated gravitational wave patterns in ground- and space-based gravitational wave detectors. These effects are difficult to discriminate when the lens is small and the repeated patterns superpose. Traditionally, matched filtering techniques are used to identify gravitational-wave signals, but we instead aim to use machine learning techniques to achieve this. In this work, we implement supervised machine learning classifiers (support vector machine, random forest, multi-layer perceptron) to discriminate such lensing patterns in gravitational wave data. We train classifiers with spectrograms of both lensed and unlensed waves using both point-mass and singular isothermal sphere lens models. As a result, the classifiers return F1 scores ranging from 0.852 to 0.996, with precisions from 0.917 to 0.992 and recalls from 0.796 to 1.000, depending on the type of classifier and lensing model used. This supports the idea that machine learning classifiers are able to correctly identify lensed gravitational wave signals. It also suggests that, in the future, machine learning classifiers may be used as a possible alternative for identifying lensed gravitational wave events, allowing us to study gravitational wave sources and massive astronomical objects through further analysis. KEYWORDS: Gravitational Waves; Gravitational Lensing; Geometrical Optics; Machine Learning; Classification; Support Vector Machine; Random Tree Forest; Multi-layer Perceptron
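
The F1 score is the harmonic mean of precision and recall. The short sketch below recomputes it from the endpoint values quoted in the abstract (0.917 and 0.796, then 0.992 and 1.000). Pairing those endpoints as if they came from the same classifier is an assumption made here for the arithmetic check; the abstract does not say so.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Endpoint values from the abstract, paired by assumption.
low = f1_score(0.917, 0.796)
high = f1_score(0.992, 1.000)
print(round(low, 3), round(high, 3))  # 0.852 0.996
```

Under that pairing, both computed values match the reported F1 range after rounding to three decimals.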


Sentiment Analysis on E-Learning Using Machine Learning Classifiers in Python

Rising Threats in Expert Applications and Solutions - Advances in Intelligent Systems and Computing ◽

10.1007/978-981-15-6014-9_1 ◽

2020 ◽

pp. 1-8

Author(s):

Shilpa Singh Hanswal ◽

Astha Pareek ◽

Geetika Vyas ◽

Amita Sharma

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

E Learning


Performance Comparison of Tree-Based Machine Learning Classifiers for Web Usage Mining

Lecture Notes in Electrical Engineering - Proceedings of International Conference on Communication, Circuits, and Systems ◽

10.1007/978-981-33-4866-0_47 ◽

2021 ◽

pp. 379-387

Author(s):

Ruchi Mittal ◽

Varun Malik ◽

Vikas Rattan ◽

Deepika Jhamb

Keyword(s):

Machine Learning ◽

Performance Comparison ◽

Web Usage Mining ◽

Machine Learning Classifiers ◽

Web Usage ◽

Learning Classifiers


Performance Comparison of Machine Learning Classifiers for Fake News Detection

2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA) ◽

10.1109/icirca48905.2020.9183072 ◽

2020 ◽

Author(s):

N. Smitha ◽

R. Bharath

Keyword(s):

Machine Learning ◽

Performance Comparison ◽

Fake News ◽

Machine Learning Classifiers ◽

Learning Classifiers


Optimization of sentiment analysis using machine learning classifiers

Human-centric Computing and Information Sciences ◽

10.1186/s13673-017-0116-3 ◽

2017 ◽

Vol 7 (1) ◽

Cited By ~ 26

Author(s):

Jaspreet Singh ◽

Gurvinder Singh ◽

Rajinder Singh

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Machine Learning Classifiers ◽

Learning Classifiers


Defending Malicious Script Attacks Using Machine Learning Classifiers

Wireless Communications and Mobile Computing ◽

10.1155/2017/5360472 ◽

2017 ◽

Vol 2017 ◽

pp. 1-9 ◽

Cited By ~ 6

Author(s):

Nayeem Khan ◽

Johari Abdullah ◽

Adnan Shahid Khan

Keyword(s):

Machine Learning ◽

Web Application ◽

Malicious Code ◽

Supervised Machine Learning ◽

Feature Subset ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Wrapper Method ◽

Supervised Machine Learning Classifiers ◽

Client Side

Web applications have become a primary target for cyber criminals, who inject malware, especially JavaScript, to perform malicious activities for impersonation. It is therefore imperative to detect such malicious code in real time, before any malicious activity is performed. This study proposes an efficient method of detecting previously unknown malicious JavaScript using an interceptor at the client side that classifies the key features of the malicious code. The feature subset was obtained using a wrapper method for dimensionality reduction. Supervised machine learning classifiers were applied to the dataset to achieve high accuracy. Experimental results show that our method can efficiently distinguish malicious code from benign code with promising results.


A Tweet Sentiment Classification Approach Using a Hybrid Stacked Ensemble Technique

Information ◽

10.3390/info12090374 ◽

2021 ◽

Vol 12 (9) ◽

pp. 374

Author(s):

Babacar Gaye ◽

Dezheng Zhang ◽

Aziguli Wulamu

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Deep Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Short Term Memory ◽

State Of The Art ◽

Accuracy Score ◽

Learning Models ◽

Proposed Model

With the extensive availability of social media platforms, Twitter has become a significant tool for gathering people’s views, opinions, attitudes, and emotions towards certain entities. Within this frame of reference, sentiment analysis of tweets has become one of the most fascinating research areas in the field of natural language processing. A variety of techniques have been devised for sentiment analysis, but there is still room for improvement in the accuracy and efficacy of such systems. This study proposes a novel approach that exploits the advantages of a lexical dictionary, machine learning, and deep learning classifiers. We classified the tweets based on the sentiments extracted by TextBlob, using a stacked ensemble of three long short-term memory (LSTM) networks as base classifiers and logistic regression (LR) as a meta classifier. The proposed model proved to be effective and time-saving since it does not require feature extraction, as the LSTM extracts features without any human intervention. We also compared our proposed approach with conventional machine learning models such as logistic regression, AdaBoost, and random forest, as well as with state-of-the-art deep learning models. Experiments were conducted on the sentiment140 dataset and were evaluated in terms of accuracy, precision, recall, and F1 score. Empirical results showed that our proposed approach achieved state-of-the-art results, with an accuracy score of 99%.
