COVID19 Sentiment Analysis using Machine Learning Classification Algorithms

Author(s):  
Kusumanchi Naga Sireesha and Padala Srinivasa Reddy

Along with the Coronavirus pandemic, another crisis has manifested itself in the form of mass fear and panic phenomena, fuelled by incomplete and often inaccurate information. There is therefore a tremendous need to address and better understand COVID-19’s informational crisis. The widespread use of social networking sites such as Twitter speeds up the sharing of information and opinions about community events and health crises, and COVID-19 has been one of Twitter's trending topics. Messages posted on Twitter are called Tweets. In this paper, we identify public sentiment associated with the pandemic using Coronavirus-specific Tweets and Python, along with its sentiment analysis packages. We provide an overview of two essential machine learning classification methods, in the context of textual analytics, and compare their effectiveness in classifying Coronavirus Tweets of varying lengths. This research provides insights into Coronavirus fear sentiment progression, associated methods, limitations, and opportunities. In this project, we have designed a sentiment analysis system that identifies the sentiment of a tweet and classifies it into one of five classes: “Extremely Positive”, “Positive”, “Neutral”, “Negative” and “Extremely Negative”.

Today, people routinely use social media networks to communicate with other users and to share information across the world. Online social networking sites have become important tools, providing a common medium through which many users communicate with each other. Twitter is the most prominent microblogging website and one of the social networking sites that grows daily. Social media contains an extensive amount of data in the form of tweets, forum posts, status updates, comments, etc. To process and analyze these data automatically, applications can rely on approaches such as sentiment analysis. Twitter sentiment analysis is the application of sentiment analysis to data from Twitter (tweets) to obtain users' opinions and sentiments. The Natural Language Toolkit (NLTK) is a Python library that includes machine learning methods and sentiment analysis tools, and it provides the base for text processing and classification. This research work proposes a machine learning-based classifier to extract tweets on elections and analyze the opinions of tweeples (people who use Twitter). The tweets are categorized as positive, negative or neutral towards a particular politician. We classify the processed tweets using a supervised machine learning approach; the classifier used is the Naive Bayes classifier, which is trained on tweets bearing a distinctive polarity. The percentage of positive and negative tweets is then measured and represented graphically.
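The classification step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it uses a hand-written multinomial Naive Bayes with add-one smoothing in plain Python rather than NLTK, and the training tweets, the `tokenize` helper and the `NaiveBayes` class are all hypothetical names and data invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
train = [
    ("great speech by the candidate", "positive"),
    ("i love this policy", "positive"),
    ("terrible debate performance", "negative"),
    ("i hate this corrupt campaign", "negative"),
    ("the election is on tuesday", "neutral"),
    ("polls open at nine", "neutral"),
]
model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])

new_tweets = ["love the speech", "hate the debate", "polls open tuesday"]
predictions = [model.predict(t) for t in new_tweets]

# Percentage of tweets assigned to each polarity.
shares = {label: 100 * n / len(predictions) for label, n in Counter(predictions).items()}
```

The `shares` dictionary holds the per-class percentages that would then be plotted. A real system would train on a much larger labelled corpus and would likely add preprocessing such as removing URLs and user mentions.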


Information ◽  
2020 ◽  
Vol 11 (6) ◽  
pp. 314 ◽  
Author(s):  
Jim Samuel ◽  
G. G. Md. Nawaz Ali ◽  
Md. Mokhlesur Rahman ◽  
Ek Esawi ◽  
Yana Samuel

Along with the Coronavirus pandemic, another crisis has manifested itself in the form of mass fear and panic phenomena, fueled by incomplete and often inaccurate information. There is therefore a tremendous need to address and better understand COVID-19’s informational crisis and gauge public sentiment, so that appropriate messaging and policy decisions can be implemented. In this research article, we identify public sentiment associated with the pandemic using Coronavirus specific Tweets and R statistical software, along with its sentiment analysis packages. We demonstrate insights into the progress of fear-sentiment over time as COVID-19 approached peak levels in the United States, using descriptive textual analytics supported by necessary textual data visualizations. Furthermore, we provide a methodological overview of two essential machine learning (ML) classification methods, in the context of textual analytics, and compare their effectiveness in classifying Coronavirus Tweets of varying lengths. We observe a strong classification accuracy of 91% for short Tweets, with the Naïve Bayes method. We also observe that the logistic regression classification method provides a reasonable accuracy of 74% with shorter Tweets, and both methods showed relatively weaker performance for longer Tweets. This research provides insights into Coronavirus fear sentiment progression, and outlines associated methods, implications, limitations and opportunities.
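To make the short-versus-long comparison concrete, the sketch below computes accuracy separately for the two groups of Tweets. It assumes predictions from some classifier are already available; the function name `accuracy_by_length`, the character cutoff and the example data are hypothetical and are not taken from the article.

```python
from collections import defaultdict


def accuracy_by_length(tweets, gold, predicted, short_max_chars):
    """Return accuracy separately for short and long tweets.

    short_max_chars is a hypothetical cutoff supplied by the caller:
    tweets with fewer characters count as "short", the rest as "long".
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for text, y_true, y_pred in zip(tweets, gold, predicted):
        bucket = "short" if len(text) < short_max_chars else "long"
        totals[bucket] += 1
        hits[bucket] += int(y_true == y_pred)
    return {bucket: hits[bucket] / totals[bucket] for bucket in totals}


# Invented example data: two short and two long tweets.
tweets = [
    "stay safe everyone",
    "so scared right now",
    "hospitals near me are full and nobody knows when this will end",
    "grateful for the nurses and doctors working through the night",
]
gold = ["positive", "negative", "negative", "positive"]
predicted = ["positive", "negative", "positive", "positive"]

result = accuracy_by_length(tweets, gold, predicted, short_max_chars=30)
```

Running the same function on the outputs of two classifiers gives a side-by-side view of how each one degrades as Tweets get longer.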


2021 ◽  
Vol 5 (1) ◽  
pp. 566-576
Author(s):  
Azeez A. Nureni ◽  
Victor E. Ogunlusi ◽  
Emmanuel Junior Uloko

Sentiment analysis involves techniques for analyzing texts in order to identify the sentiment and emotion dominant in such texts and classify them accordingly. These techniques include, but are not limited to, preprocessing of texts and the use of a machine learning or lexicon-based approach to classify them. In this research, an attempt was made to adopt a machine learning approach to classify tweets on Covid-19, which is considered a global pandemic. To achieve this objective, a cross-dataset approach was applied to train four machine learning classification algorithms: Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB) and K-Nearest Neighbors (KNN). The final result will not only identify the best-performing algorithm but also help create awareness of Covid-19, with the ultimate objective of destigmatizing patients through the analysis of sentiments and emotions on Covid-19, and it can be used to help contain the spread of the pandemic.
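As a rough illustration of a cross-dataset setup, the sketch below trains on one labelled set of tweets and scores on a different one, using K-Nearest Neighbors as the example algorithm. It is an assumption-laden sketch rather than the authors' pipeline: the bag-of-words features, the cosine similarity, the function names and both toy datasets are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one tweet."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def knn_predict(train, text, k=3):
    """Label `text` by majority vote among its k most similar training tweets."""
    query = vectorize(text)
    ranked = sorted(train, key=lambda pair: cosine(vectorize(pair[0]), query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_dataset_accuracy(train, test, k=3):
    """Fit on one labelled dataset and measure accuracy on a different one."""
    correct = sum(knn_predict(train, text, k) == label for text, label in test)
    return correct / len(test)


# Two invented datasets standing in for independently collected tweet corpora.
dataset_a = [
    ("vaccines give me hope", "positive"),
    ("so proud of our nurses", "positive"),
    ("lockdown is making me anxious", "negative"),
    ("scared of the rising cases", "negative"),
]
dataset_b = [
    ("proud of the nurses today", "positive"),
    ("cases rising again and i am scared", "negative"),
]

accuracy = cross_dataset_accuracy(dataset_a, dataset_b, k=1)
```

Swapping `knn_predict` for another classifier while keeping `cross_dataset_accuracy` unchanged is one way to compare several algorithms under the same train/test split.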

