Rumor Detection on Twitter Using a Supervised Machine Learning Framework

Author(s):  
Hardeo Kumar Thakur ◽  
Anand Gupta ◽  
Ayushi Bhardwaj ◽  
Devanshi Verma

This article describes how a rumor can be defined as a circulating unverified story or a doubtful truth. Rumor initiators seek social networks vulnerable to unlimited spread; online social media therefore becomes their stage. This misinformation inflicts colossal damage on individuals, organizations, governments, and others. Existing work that analyzes the temporal and linguistic characteristics of rumors tends to leave ample time for rumors to propagate. Meanwhile, with the huge outburst of data on social media, studying these characteristics for each tweet becomes spatially complex. Therefore, in this article, a two-fold supervised machine-learning framework is proposed that detects rumors by first filtering tweets and then analyzing their linguistic properties. The method attempts to automate filtering by training multiple classification algorithms, achieving accuracy higher than 81.079%. Finally, rumors are detected using textual characteristics of the filtered data. The effectiveness of the proposed framework is shown through extensive experiments on over 10,000 tweets.

2018 ◽  
Vol 8 (3) ◽  
pp. 1-13 ◽  


Author(s):  
V.T Priyanga ◽  
J.P Sanjanasri ◽  
Vijay Krishna Menon ◽  
E.A Gopalakrishnan ◽  
K.P Soman

The widespread use of social media such as Facebook, Twitter, and WhatsApp has changed the way news is created and published; accessing news has become easy and inexpensive. However, the scale of usage and the inability to moderate content have made social media a breeding ground for the circulation of fake news. Fake news is deliberately created either to increase readership or to disrupt order in society for political and commercial benefit. It is of paramount importance to identify and filter out fake news, especially in democratic societies. Most existing methods for detecting fake news involve traditional supervised machine learning, which has been quite ineffective. In this paper, we analyze word embedding features that can tell fake news apart from true news. We use the LIAR and ISOT data sets. We extract highly correlated news items from the entire data set using cosine similarity and other such metrics, in order to distinguish their domains based on central topics. We then employ auto-encoders to detect and differentiate between true and fake news, while also exploring their separability through network analysis.
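The cosine-similarity step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 0.9 threshold, and the toy three-dimensional vectors are all assumptions made for illustration, and real word or document embeddings would come from a trained model.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def correlated_pairs(embeddings, threshold=0.9):
    """Return (id_a, id_b, similarity) for every pair at or above the threshold.

    `embeddings` maps an item id to its vector. The threshold is a
    hypothetical value, not one taken from the paper.
    """
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(embeddings.items(), 2):
        sim = cosine_similarity(vec_a, vec_b)
        if sim >= threshold:
            pairs.append((id_a, id_b, sim))
    return pairs


# Toy vectors standing in for document embeddings (assumed data).
toy_embeddings = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 0.0, 1.0],
}
similar = correlated_pairs(toy_embeddings)
```

With these toy vectors only the pair ("a", "b") passes the threshold, since "c" is orthogonal to both.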


2021 ◽  
pp. 1-13
Author(s):  
C S Pavan Kumar ◽  
L D Dhinesh Babu

Sentiment analysis is widely used to retrieve the hidden sentiments in medical discussions on online social networking platforms such as Twitter, Facebook, and Instagram. People often convey their feelings about their medical problems on social media platforms. Practitioners and health care workers have started to observe these discussions to assess the impact of health-related issues among the people. This helps in providing better care to improve the quality of life. Dementia is a serious disease in western countries like the United States of America and the United Kingdom, and the respective governments are providing facilities to the affected people. There is much chatter on social media platforms concerning patients' care, healthy measures to follow to avoid the disease, and checking for early indications. This chatter has to be carefully monitored to help officials take the necessary precautions for the betterment of those affected. A novel feature engineering architecture involving feature-split is proposed for sentiment analysis of medical chatter over online social networks, with a pipeline that can be used with any machine learning model. The proposed model uses a fuzzy membership function to refine the outputs. The sentiment score obtained by the machine learning model is subjected to fuzzification and defuzzification using the trapezoid membership function and the center of sums method, respectively. Three datasets are considered for comparing the proposed model with the regular model. The proposed approach delivered better results than the normal approach and proved to be an effective approach for sentiment analysis of medical discussions over online social networks.
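The fuzzification and defuzzification step described above can be sketched as follows. This is a sketch under stated assumptions rather than the authors' implementation: the score range of [-1, 1], the three fuzzy sets and their breakpoints, and all function names are hypothetical. The center of sums is approximated numerically by summing the clipped membership curves of all sets, so overlapping regions are counted once per set.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoid membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical fuzzy sets over a sentiment score assumed to lie in [-1, 1].
SETS = {
    "negative": (-1.0, -1.0, -0.6, -0.1),
    "neutral": (-0.4, -0.1, 0.1, 0.4),
    "positive": (0.1, 0.6, 1.0, 1.0),
}


def fuzzify(score):
    """Map a crisp score to a membership degree for each fuzzy set."""
    return {name: trapezoid(score, *params) for name, params in SETS.items()}


def center_of_sums(degrees, steps=2000):
    """Defuzzify by the center of sums, approximated on a uniform grid.

    Each set is clipped at its membership degree; the clipped curves are
    summed (not max-combined) before taking the weighted mean of x.
    """
    lo, hi = -1.0, 1.0
    numerator = 0.0
    denominator = 0.0
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        total = sum(
            min(degrees[name], trapezoid(x, *params))
            for name, params in SETS.items()
        )
        numerator += x * total
        denominator += total
    return numerator / denominator if denominator else 0.0


def refine(score):
    """Fuzzify a raw model score, then defuzzify it back to a crisp value."""
    return center_of_sums(fuzzify(score))
```

Calling `refine` on a raw model score returns a crisp value pulled toward the centers of the sets it belongs to; how the refined value is then mapped to a sentiment label is left open here.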


2020 ◽  
Vol 30 (11n12) ◽  
pp. 1759-1777
Author(s):  
Jialing Liang ◽  
Peiquan Jin ◽  
Lin Mu ◽  
Jie Zhao

With the development of Web 2.0, social media such as Twitter and Sina Weibo have become essential platforms for disseminating hot events. At the same time, because microblogging services are free to use, users can post user-generated content freely on microblogging platforms. Accordingly, more and more hot events on microblogging platforms have been labeled as spammers. Spammers not only hurt the healthy development of social media but also introduce many economic and social problems. Therefore, the government and enterprises must distinguish whether a hot event on microblogging platforms is a spammer or a naturally developing event. In this paper, we focus on the hot event list on Sina Weibo and collect the relevant microblogs of each hot event to study methods for detecting spammers. Notably, we develop an integral feature set consisting of user profile, user behavior, and user relationships to reflect the various factors affecting the detection of spammers. Then, we employ typical machine learning methods to conduct extensive experiments on detecting spammers. We use a real data set crawled from the most prominent Chinese microblogging platform, Sina Weibo, and evaluate the performance of 10 machine learning models with five sampling methods. The results, in terms of various metrics, show that the Random Forest model and the over-sampling method achieve the best accuracy in detecting spammers and non-spammers.
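The over-sampling idea mentioned above can be illustrated with a minimal sketch of random over-sampling, which duplicates minority-class samples until every class matches the largest one. The abstract does not say which over-sampling variant the authors used, so this choice, the function name, and the toy data are assumptions; the classifier itself (for example a Random Forest) would be trained on the balanced output and is not shown.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Balance classes by resampling minority classes with replacement.

    Returns new (samples, labels) lists in which every class has as many
    items as the largest class in the input.
    """
    rng = random.Random(seed)  # fixed seed for reproducible resampling
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    target = max(len(group) for group in by_label.values())

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw the missing items from the class's own samples.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy feature vectors (assumed data): four non-spammers, one spammer.
X = [[0.1, 3], [0.2, 5], [0.3, 4], [0.2, 6], [0.9, 40]]
y = ["normal", "normal", "normal", "normal", "spammer"]
X_balanced, y_balanced = random_oversample(X, y)
class_counts = Counter(y_balanced)
```

Over-sampling should be applied to the training split only; duplicating samples before splitting would leak copies of training items into the evaluation data.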


2020 ◽  
Vol 34 (10) ◽  
pp. 13971-13972
Author(s):  
Yang Qi ◽  
Farseev Aleksandr ◽  
Filchenkov Andrey

Nowadays, social networks play a crucial role in everyday human life and are no longer associated purely with spare time. In fact, instant communication with friends and colleagues has become an essential component of our daily interaction, giving rise to the emergence of multiple new types of social networks. By participating in such networks, individuals generate a multitude of data points that describe their activities from different perspectives and can be further used for applications such as personalized recommendation or user profiling. However, the impact of different social media networks on machine learning model performance has not yet been studied comprehensively. In particular, the literature on modeling multi-modal data from multiple social networks is relatively sparse, which has inspired us to take a deeper dive into the topic in this preliminary study. Specifically, in this work, we study the performance of different machine learning models when trained on multi-modal data from different social networks. Our initial experimental results reveal that the choice of social network impacts performance and that proper selection of the data source is crucial.


Author(s):  
Asdrúbal López Chau ◽  
David Valle-Cruz ◽  
Rodrigo Sandoval-Almazán

One of the pillars of connected government is citizen centricity: an approach in which citizen participation is essential. In Mexico, social networks are currently one of the most important means by which citizens express their needs and provide opinions to the government. The goal of this chapter is to contribute to citizen centricity by adapting the methodology of sentiment analysis of social media posts to an expanded version for crisis situations. The main difference between this approach and the normally accepted one is that, instead of using pre-defined classes (positive and negative) for sentiments, the authors first determined the different data categories and then applied them to the classic process of sentiment analysis. This approach was tested using posts on Mexico's earthquake in 2017. The authors found that the needs, demands, and claims made in the posts reflect sentiments in a better way, and that this can help to improve the government-citizen connection.

