Intelligent Detection of False Information in Arabic Tweets Utilizing Hybrid Harris Hawks Based Feature Selection and Machine Learning Models

Fake or false information on social media platforms is a significant challenge that leads to deliberately misleading users due to the inclusion of rumors, propaganda, or deceptive information about a person, organization, or service. Twitter is one of the most widely used social media platforms, especially in the Arab region, where the number of users is steadily increasing, accompanied by an increase in the rate of fake news. This drew the attention of researchers to provide a safe online environment free of misleading information. This paper aims to propose a smart classification model for the early detection of fake news in Arabic tweets utilizing Natural Language Processing (NLP) techniques, Machine Learning (ML) models, and Harris Hawks Optimizer (HHO) as a wrapper-based feature selection approach. Arabic Twitter corpus composed of 1862 previously annotated tweets was utilized by this research to assess the efficiency of the proposed model. The Bag of Words (BoW) model is utilized using different term-weighting schemes for feature extraction. Eight well-known learning algorithms are investigated with varying combinations of features, including user-profile, content-based, and words-features. Reported results showed that the Logistic Regression (LR) with Term Frequency-Inverse Document Frequency (TF-IDF) model scores the best rank. Moreover, feature selection based on the binary HHO algorithm plays a vital role in reducing dimensionality, thereby enhancing the learning model’s performance for fake news detection. Interestingly, the proposed BHHO-LR model can yield a better enhancement of 5% compared with previous works on the same dataset.

Download Full-text

An Effecient Fake News Detection System Using Machine Learning

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.j9453.0881019 ◽

2019 ◽

Vol 8 (10) ◽

pp. 3125-3129 ◽

Cited By ~ 1

Keyword(s):

Machine Learning ◽

Social Media ◽

Language Processing ◽

Negative Impact ◽

Detection System ◽

Vital Role ◽

Machine Learning Algorithms ◽

Easy Access ◽

Fake News ◽

K Nearest Neighbors

Social media plays a major role in several things in our life. Social media helps all of us to find some important news with low price. It also provides easy access in less time. But sometimes social media gives a chance for the fast-spreading of fake news. So there is a possibility that less quality news with false information is spread through the social media. This shows a negative impact on the number of people. Sometimes it may impact society also. So, detection of fake news has vast importance. Machine learning algorithms play a vital role in fake news detection; Especially NLP (Natural Language Processing) algorithms are very useful for detecting the fake news. In this paper, we employed machine learning classifiers SVM, K-Nearest Neighbors, Decision tree, Random forest. By using these classifiers we successfully build a model to detect fake news from the given dataset. Python language was used for experiments.

Download Full-text

Why it is important to consider negative ties when studying polarized debates: A signed network analysis of a Dutch cultural controversy on Twitter

PLoS ONE ◽

10.1371/journal.pone.0256696 ◽

2021 ◽

Vol 16 (8) ◽

pp. e0256696

Author(s):

Anna Keuchenius ◽

Petter Törnberg ◽

Justus Uitermark

Keyword(s):

Machine Learning ◽

Social Media ◽

Network Analysis ◽

Language Processing ◽

User Interaction ◽

User Interactions ◽

Online Interactions ◽

Network Analyses ◽

Social Media Platforms ◽

Signed Network

Despite the prevalence of disagreement between users on social media platforms, studies of online debates typically only look at positive online interactions, represented as networks with positive ties. In this paper, we hypothesize that the systematic neglect of conflict that these network analyses induce leads to misleading results on polarized debates. We introduce an approach to bring in negative user-to-user interaction, by analyzing online debates using signed networks with positive and negative ties. We apply this approach to the Dutch Twitter debate on ‘Black Pete’—an annual Dutch celebration with racist characteristics. Using a dataset of 430,000 tweets, we apply natural language processing and machine learning to identify: (i) users’ stance in the debate; and (ii) whether the interaction between users is positive (supportive) or negative (antagonistic). Comparing the resulting signed network with its unsigned counterpart, the retweet network, we find that traditional unsigned approaches distort debates by conflating conflict with indifference, and that the inclusion of negative ties changes and enriches our understanding of coalitions and division within the debate. Our analysis reveals that some groups are attacking each other, while others rather seem to be located in fragmented Twitter spaces. Our approach identifies new network positions of individuals that correspond to roles in the debate, such as leaders and scapegoats. These findings show that representing the polarity of user interactions as signs of ties in networks substantively changes the conclusions drawn from polarized social media activity, which has important implications for various fields studying online debates using network analysis.

Download Full-text

Fake and Real News detection Using Python

International Journal of Scientific Research in Science and Technology ◽

10.32628/ijsrst207376 ◽

2020 ◽

pp. 423-428

Author(s):

Fakhra Akhtar ◽

Faizan Ahmed Khan

Keyword(s):

Social Media ◽

Fake News ◽

False Information ◽

False Negatives ◽

Mainstream Media ◽

Social Media Platforms ◽

The Masses ◽

Negative Attributes ◽

Entire Dataset ◽

F Measure

In the digital age, fake news has become a well-known phenomenon. The spread of false evidence is often used to confuse mainstream media and political opponents, and can lead to social media wars, hatred arguments and debates.Fake news is blurring the distinction between real and false information, and is often spread on social media resulting in negative views and opinions. Earlier Research describe the fact that false propaganda is used to create false stories on mainstream media in order to cause a revolt and tension among the masses The digital rights foundation DRF report, which builds on the experiences of 152 journalists and activists in Pakistan, presents that more than 88 % of the participants find social media platforms as the worst source for information, with Facebook being the absolute worst. The dataset used in this paper relates to Real and fake news detection. The objective of this paper is to determine the Accuracy , precision , of the entire dataset .The results are visualized in the form of graphs and the analysis was done using python. The results showed the fact that the dataset holds 95% of the accuracy. The number of actual predicted cases were 296. Results of this paper reveals that The accuracy of the model dataset is 95.26 % the precision results 95.79 % whereas recall and F-Measure shows 94.56% and 95.17% accuracy respectively.Whereas in predicted models there are 296 positive attributes , 308 negative attributes 17 false positives and 13 false negatives. This research recommends that authenticity of news should be analysed first instead of drafting an opinion, sharing fake news or false information is considered unethical journalists and news consumers both should act responsibly while sharing any news.

Download Full-text

Fake News Finds an Audience

Journalism and Truth in an Age of Social Media ◽

10.1093/oso/9780190900250.003.0014 ◽

2019 ◽

pp. 201-222

Author(s):

Erik P. Bucy ◽

John E. Newhagen

Keyword(s):

Social Media ◽

Big Data ◽

Information Processing ◽

Educational Strategy ◽

Fake News ◽

False Information ◽

User Experiences ◽

The Social ◽

Social Media Platforms ◽

Media Systems

The vulnerabilities shown by media systems and individual users exposed to attacks on truth from fake news and computational propaganda in recent years should be considered in light of the characteristics and concerns surrounding big data, especially the volume and velocity of messages delivered over social media platforms that tax the average user’s capacity to determine their truth value in real time. For reasons explained by the psychology of information processing, a high percentage of fake news that reaches audiences is accepted as true, particularly when distractions and interruptions typify user experiences with technology. As explained in this essay, fake news thrives in environments lacking editorial policing and epistemological vigilance, making the social media milieu ideally suited for spreading false information. In response, we suggest the value of an educational strategy to combat the dilemma that digital disinformation poses to informed citizenship.

Download Full-text

Ternion: An Autonomous Model for Fake News Detection

Applied Sciences ◽

10.3390/app11199292 ◽

2021 ◽

Vol 11 (19) ◽

pp. 9292

Author(s):

Noman Islam ◽

Asadullah Shaikh ◽

Asma Qaiser ◽

Yousef Asiri ◽

Sultan Almakdi ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Support Vector Machine ◽

Logistic Regression ◽

Language Processing ◽

Negative Impact ◽

Machine Learning Techniques ◽

Support Vector ◽

Fake News ◽

Processing Techniques

In recent years, the consumption of social media content to keep up with global news and to verify its authenticity has become a considerable challenge. Social media enables us to easily access news anywhere, anytime, but it also gives rise to the spread of fake news, thereby delivering false information. This also has a negative impact on society. Therefore, it is necessary to determine whether or not news spreading over social media is real. This will allow for confusion among social media users to be avoided, and it is important in ensuring positive social development. This paper proposes a novel solution by detecting the authenticity of news through natural language processing techniques. Specifically, this paper proposes a novel scheme comprising three steps, namely, stance detection, author credibility verification, and machine learning-based classification, to verify the authenticity of news. In the last stage of the proposed pipeline, several machine learning techniques are applied, such as decision trees, random forest, logistic regression, and support vector machine (SVM) algorithms. For this study, the fake news dataset was taken from Kaggle. The experimental results show an accuracy of 93.15%, precision of 92.65%, recall of 95.71%, and F1-score of 94.15% for the support vector machine algorithm. The SVM is better than the second best classifier, i.e., logistic regression, by 6.82%.

Download Full-text

Detecting Fake News using Machine Learning Algorithms

10.36227/techrxiv.12089133 ◽

2020 ◽

Author(s):

Harika Kudarvalli ◽

Jinan Fiaidhi

Keyword(s):

Social Media ◽

Real Time ◽

Short Term Memory ◽

Vital Role ◽

Machine Learning Algorithms ◽

Support Vector ◽

Fake News ◽

Time Data ◽

Time News ◽

Social Media Platforms

Spreading fake news has become a serious issue in the current social media world. It is broadcasted with dishonest intentions to mislead people. This has caused many unfortunate incidents in different countries. The most recent one was the latest presidential elections where the voters were mis lead to support a leader. Twitter is one of the most popular social media platforms where users look up for real time news. We extracted real time data on multiple domains through twitter and performed analysis. The dataset was preprocessed and user_verified column played a vital role. Multiple machine algorithms were then performed on the extracted features from preprocessed dataset. Logistic Regression and Support Vector Machine had promising results with both above 92% accuracy. Naive Bayes and Long-Short Term memory didn't achieve desired accuracies. The model can also be applied to images and videos for better detection of fake news.

Download Full-text

Sentiment Analysis for Social Media using SVM Classifier of Machine Learning

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i1107.0789s419 ◽

2019 ◽

Vol 8 (9S4) ◽

pp. 39-47

Keyword(s):

Machine Learning ◽

Social Media ◽

Feature Selection ◽

Sentiment Analysis ◽

Language Processing ◽

Cuckoo Search ◽

Support Vector ◽

Svm Classifier ◽

Feature Selection Technique ◽

Performance Factors

Sentiment analysis is an area of natural language processing (NLP) and machine learning where the text is to be categorized into predefined classes i.e. positive and negative. As the field of internet and social media, both are increasing day by day, the product of these two nowadays is having many more feedbacks from the customer than before. Text generated through social media, blogs, post, review on any product, etc. has become the bested suited cases for consumer sentiment, providing a best-suited idea for that particular product. Features are an important source for the classification task as more the features are optimized, the more accurate are results. Therefore, this research paper proposes a hybrid feature selection which is a combination of Particle swarm optimization (PSO) and cuckoo search. Due to the subjective nature of social media reviews, hybrid feature selection technique outperforms the traditional technique. The performance factors like f-measure, recall, precision, and accuracy tested on twitter dataset using Support Vector Machine (SVM) classifier and compared with convolution neural network. Experimental results of this paper on the basis of different parameters show that the proposed work outperforms the existing work

Download Full-text

A Pinnacle Technique for Detection of COVID-19 Fake News in Social Media

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a8176.1110120 ◽

2020 ◽

Vol 10 (1) ◽

pp. 256-261

Keyword(s):

Machine Learning ◽

Social Media ◽

Random Forest ◽

General Public ◽

Classification Model ◽

Fake News ◽

Random Forest Classification ◽

Forest Classification ◽

The World ◽

The Impact

Today the world is gripped with fear of the most infectious disease which was caused by a newly discovered virus namely corona and thus termed as COVID-19. This is a large group of viruses which severely affects humans. The world bears testimony to its contagious nature and rapidity of spreading the illness. 50l people got infected and 30l people died due to this pandemic all around the world. This made a wide impact for people to fear the epidemic around them. The death rate of male is more compared to female. This Pandemic news has caught the attention of the world and gained its momentum in almost all the media platforms. There was an array of creating and spreading of true as well as fake news about COVID-19 in the social media, which has become popular and a major concern to the general public who access it. Spreading such hot news in social media has become a new trend in acquiring familiarity and fan base. At the time it is undeniable that spreading of such fake news in and around creates lots of confusion and fear to the public. To stop all such rumors detection of fake news has become utmost important. To effectively detect the fake news in social media the emerging machine learning classification algorithms can be an appropriate method to frame the model. In the context of the COVID-19 pandemic, we investigated and implemented by collecting the training data and trained a machine learning model by using various machine learning algorithms to automatically detect the fake news about the Corona Virus. The machine learning algorithm used in this investigation is Naïve Bayes classifier and Random forest classification algorithm for the best results. A separate model for each classifier is created after the data preparation and feature extraction Techniques. The results obtained are compared and examined accurately to evaluate the accurate model. Our experiments on a benchmark dataset with random forest classification model showed a promising results with an overall accuracy of 94.06%. This experimental evaluation will prevent the general public to keep themselves out of their fear and to know and understand the impact of fast-spreading as well as misleading fake news.

Download Full-text

Using Tweets to Understand How COVID-19–Related Health Beliefs Are Affected in the Age of Social Media: Twitter Data Analysis Study (Preprint)

10.2196/preprints.26302 ◽

2020 ◽

Author(s):

Hanyin Wang ◽

Yikuan Li ◽

Meghan Hutch ◽

Andrew Naidech ◽

Yuan Luo

Keyword(s):

Machine Learning ◽

Social Media ◽

Health Beliefs ◽

Language Processing ◽

Large Population ◽

Health Belief ◽

Perceived Barriers ◽

Machine Learning Techniques ◽

Perceived Benefits ◽

Social Media Platforms

BACKGROUND The emergence of SARS-CoV-2 (ie, COVID-19) has given rise to a global pandemic affecting 215 countries and over 40 million people as of October 2020. Meanwhile, we are also experiencing an infodemic induced by the overabundance of information, some accurate and some inaccurate, spreading rapidly across social media platforms. Social media has arguably shifted the information acquisition and dissemination of a considerably large population of internet users toward higher interactivities. OBJECTIVE This study aimed to investigate COVID-19-related health beliefs on one of the mainstream social media platforms, Twitter, as well as potential impacting factors associated with fluctuations in health beliefs on social media. METHODS We used COVID-19-related posts from the mainstream social media platform Twitter to monitor health beliefs. A total of 92,687,660 tweets corresponding to 8,967,986 unique users from January 6 to June 21, 2020, were retrieved. To quantify health beliefs, we employed the health belief model (HBM) with four core constructs: perceived susceptibility, perceived severity, perceived benefits, and perceived barriers. We utilized natural language processing and machine learning techniques to automate the process of judging the conformity of each tweet with each of the four HBM constructs. A total of 5000 tweets were manually annotated for training the machine learning architectures. RESULTS The machine learning classifiers yielded areas under the receiver operating characteristic curves over 0.86 for the classification of all four HBM constructs. Our analyses revealed a basic reproduction number R0 of 7.62 for trends in the number of Twitter users posting health belief–related content over the study period. The fluctuations in the number of health belief–related tweets could reflect dynamics in case and death statistics, systematic interventions, and public events. Specifically, we observed that scientific events, such as scientific publications, and nonscientific events, such as politicians’ speeches, were comparable in their ability to influence health belief trends on social media through a Kruskal-Wallis test (P=.78 and P=.92 for perceived benefits and perceived barriers, respectively). CONCLUSIONS As an analogy of the classic epidemiology model where an infection is considered to be spreading in a population with an R0 greater than 1, we found that the number of users tweeting about COVID-19 health beliefs was amplifying in an epidemic manner and could partially intensify the infodemic. It is “unhealthy” that both scientific and nonscientific events constitute no disparity in impacting the health belief trends on Twitter, since nonscientific events, such as politicians’ speeches, might not be endorsed by substantial evidence and could sometimes be misleading.

Download Full-text

Combating Fake News, Misinformation, and Machine Learning Generated Fakes: Insight's from the Islamic Ethical Tradition

ICR Journal ◽

10.52282/icr.v10i2.42 ◽

2019 ◽

Vol 10 (2) ◽

pp. 189-212

Author(s):

Talat Zubair ◽

Amana Raquib ◽

Junaid Qadir

Keyword(s):

Machine Learning ◽

Social Media ◽

Deep Learning ◽

World Wide ◽

Security And Privacy ◽

Fake News ◽

Islamic Ethics ◽

The World ◽

Social Media Platforms ◽

Filter Bubbles

The growing trend of sharing and acquiring news through social media platforms and the World Wide Web has impacted individuals as well as societies, spreading misinformation and disinformation. This trend—along with rapid developments in the field of machine learning, particularly with the emergence of techniques such as deep learning that can be used to generate data—has grave political, social, ethical, security, and privacy implications for society. This paper discusses the technologies that have led to the rise of problems such as fake news articles, filter bubbles, social media bots, and deep-fake videos, and their implications, while providing insights from the Islamic ethical tradition that can aid in mitigating them. We view these technologies and artifacts through the Islamic lens, concluding that they violate the commandment of spreading truth and countering falsehood. We present a set of guidelines, with reference to Qur‘anic and Prophetic teachings and the practices of the early Muslim scholars, on countering deception, putting forward ideas on developing these technologies while keeping Islamic ethics in perspective.

Download Full-text