Survey Paper on Hybrid Approach for Twitter Sentiment Analysis using Supervised Machine Learning Algorithms

Attack and anomaly detection are rising concerns in Internet of Things (IoT) infrastructures. As IoT infrastructure spreads into every domain, threats and attacks against it grow proportionally. This paper compares several machine learning algorithms for identifying cyber-attacks (namely SYN-DOS attacks) on IoT systems, in terms of both application performance and training/application times. We use supervised machine learning algorithms included in the MLlib library of Apache Spark, a fast and general engine for big data processing. We show the implementation details and the performance of those algorithms on public datasets, using a training set of up to 2 million instances. We adopt a Cloud environment, emphasizing the importance of scalability and elasticity of use. Results show that all the Spark algorithms used achieve very good identification accuracy (>99%), and one of them, Random Forest, achieves an accuracy of 1. We also report a very short training time (23.22 sec for Decision Tree with 2 million rows). The experiments also show a very low application time (0.13 sec for more than 600,000 instances for Random Forest) using Apache Spark in the Cloud. Furthermore, the explicit model generated by Random Forest is very easy to implement in high- or low-level programming languages. In light of these results, both in computation times and in identification performance, a hybrid approach for the detection of SYN-DOS cyber-attacks on IoT devices is proposed: an explicit Random Forest model implemented directly on the IoT device, along with a second-level analysis (training) performed in the Cloud.
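
To illustrate what an "explicit" Random Forest model running on a device could look like, the following is a minimal sketch in Python: each tree is written out as plain conditionals and the forest takes a majority vote. The feature names (`syn_per_sec`, `ack_ratio`, `half_open_conns`), the thresholds, and the three trees are all hypothetical and are not taken from the paper's trained model.

```python
from collections import Counter

# Hypothetical per-flow features and thresholds, for illustration only.
# A real model would be exported from the trees learned during training.

def tree_1(f):
    # Splits first on SYN rate, then on the share of completed handshakes.
    if f["syn_per_sec"] > 100.0:
        return "attack" if f["ack_ratio"] < 0.2 else "normal"
    return "normal"

def tree_2(f):
    # Splits first on the handshake ratio, then on half-open connections.
    if f["ack_ratio"] < 0.1:
        return "attack"
    return "attack" if f["half_open_conns"] > 50 else "normal"

def tree_3(f):
    # Splits first on half-open connections, then on SYN rate.
    if f["half_open_conns"] > 80:
        return "attack"
    return "attack" if f["syn_per_sec"] > 500.0 else "normal"

FOREST = (tree_1, tree_2, tree_3)

def predict(features):
    """Return the majority vote of all trees in the forest."""
    votes = Counter(tree(features) for tree in FOREST)
    return votes.most_common(1)[0][0]

flood = {"syn_per_sec": 900.0, "ack_ratio": 0.05, "half_open_conns": 120}
benign = {"syn_per_sec": 12.0, "ack_ratio": 0.9, "half_open_conns": 3}

print(predict(flood))
print(predict(benign))
```

Because the model is nothing more than comparisons and a vote count, the same structure translates directly to C or another low-level language, which is the property the abstract points to for on-device use.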

US Based COVID-19 Tweets Sentiment Analysis Using TextBlob and Supervised Machine Learning Algorithms

2021 International Conference on Artificial Intelligence (ICAI) ◽ 10.1109/icai52203.2021.9445207 ◽ 2021

Author(s): Rashid Khan ◽ Furqan Rustam ◽ Khadija Kanwal ◽ Arif Mehmood ◽ Gyu Sang Choi

Keyword(s): Machine Learning ◽ Sentiment Analysis ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Supervised Machine Learning

Comparison of various Supervised Machine Learning Algorithms in Sentiment Analysis of Tweets on Black Fungus

10.1109/icccnt51525.2021.9580094 ◽ 2021

Author(s): S Preethi ◽ G Ashvika Saroja

Keyword(s): Machine Learning ◽ Sentiment Analysis ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Supervised Machine Learning

A feature-centric spam email detection model using diverse supervised machine learning algorithms

The Electronic Library ◽ 10.1108/el-07-2019-0181 ◽ 2020 ◽ Vol 38 (3) ◽ pp. 633-657

Author(s): Ammara Zamir ◽ Hikmat Ullah Khan ◽ Waqar Mehmood ◽ Tassawar Iqbal ◽ Abubakker Usman Akram

Keyword(s): Machine Learning ◽ Sentiment Analysis ◽ Classification Accuracy ◽ Research Study ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Supervised Machine Learning ◽ Content Type ◽ Detection Model ◽ Proposed Model

Purpose: This research study proposes a feature-centric spam email detection model (FSEDM) based on content, sentiment, semantic, user and spam-lexicon feature sets. The purpose of this study is to exploit the role of sentiment features, along with the other proposed features, to evaluate the classification accuracy of machine learning algorithms for spam email detection.

Design/methodology/approach: Existing studies primarily exploit a content-based feature engineering approach; however, they consider only a limited number of features. In this regard, this research study proposes a feature-centric framework (FSEDM) based on existing and novel features of an email data set, which are extracted after pre-processing. Afterwards, diverse supervised learning techniques are applied to the proposed features in conjunction with feature selection techniques such as information gain, gain ratio and Relief-F, to rank the most prominent features and classify the emails as spam or ham (not spam).

Findings: Analysis and experimental results indicate that the proposed model with sentiment analysis is a competitive approach for spam email detection. Using the proposed model, a deep neural network applied with sentiment features outperformed the other classifiers, achieving a classification accuracy of up to 97.2%.

Originality/value: This research is novel in that no previous research focuses on sentiment analysis in conjunction with other email features for the detection of spam emails.
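
As an illustration of one of the feature selection techniques named above, the following minimal Python sketch ranks features by information gain, the reduction in label entropy obtained by splitting on a feature. The toy emails, the binary feature names and their values are hypothetical and are not drawn from the study's data set; gain ratio and Relief-F are not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for v, lab in zip(feature_values, labels) if v == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical toy data: eight emails, three binary features each.
labels = ["spam", "spam", "spam", "spam", "ham", "ham", "ham", "ham"]
features = {
    "has_spam_lexicon_word": [1, 1, 1, 1, 0, 0, 0, 0],
    "negative_sentiment":    [1, 1, 1, 0, 1, 0, 0, 0],
    "has_attachment":        [1, 0, 1, 0, 1, 0, 1, 0],
}

# Rank features from most to least informative.
ranking = sorted(features, key=lambda name: information_gain(features[name], labels), reverse=True)

for name in ranking:
    print(name, round(information_gain(features[name], labels), 4))
```

In this toy set the lexicon feature separates the classes perfectly and so ranks first, while the attachment feature is evenly spread across both classes and carries no information.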

Sentiment Analysis: A Comparative Study of Supervised Machine Learning Algorithms Using Rapid miner

International Journal for Research in Applied Science and Engineering Technology ◽ 10.22214/ijraset.2017.11011 ◽ 2017 ◽ Vol V (XI) ◽ pp. 80-89 ◽ Cited By ~ 6

Author(s): Priyavrat Chauhan

Keyword(s): Machine Learning ◽ Comparative Study ◽ Sentiment Analysis ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Supervised Machine Learning

Text Polarity Detection using Multiple Supervised Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.c8449.019320 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 1612-1618

Keyword(s): Machine Learning ◽ Social Media ◽ Sentiment Analysis ◽ Large Scale ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Supervised Machine Learning ◽ The Public ◽ Social Media Platforms ◽ Day By Day

Sentiment analysis is the classification of a review, opinion or statement into categories, which gives businesses and developers clarity about the specific sentiments of customers or the concerned group. Such categorized data are critical to business development and to understanding public opinion. The need for accurate, large-scale sentiment analysis on social media platforms is growing day by day. In this paper, a number of machine learning algorithms are trained and applied on Twitter datasets, and their accuracies are determined separately for each polarity of the data, indicating which algorithm works best and which works worst.
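
One way to read "accuracies determined separately for each polarity" is as per-class accuracy: for each true polarity, the fraction of tweets of that polarity a classifier labels correctly. The Python sketch below assumes that reading. The polarity labels, the classifier names (`classifier_a`, `classifier_b`) and their predictions are hypothetical placeholders, not results from the paper.

```python
from collections import defaultdict

def per_polarity_accuracy(true_labels, predicted_labels):
    """Return, for each true polarity, the share of its items predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {polarity: correct[polarity] / total[polarity] for polarity in total}

# Hypothetical ground truth and predictions from two placeholder classifiers.
truth = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
predictions = {
    "classifier_a": ["positive", "positive", "negative", "positive", "neutral", "negative"],
    "classifier_b": ["positive", "negative", "negative", "negative", "neutral", "neutral"],
}

scores = {name: per_polarity_accuracy(truth, preds) for name, preds in predictions.items()}

for name, by_polarity in scores.items():
    print(name, by_polarity)
```

Breaking accuracy down this way can show that a classifier with a good overall score is still weak on one polarity, which a single aggregate number would hide.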
