TUNING OUT HATE SPEECH ON REDDIT: AUTOMATING MODERATION AND DETECTING TOXICITY IN THE MANOSPHERE

Author(s):  
Verity Trott ◽  
Jennifer Beckett ◽  
Venessa Paech

Over the past two years social media platforms have been struggling to moderate at scale. At the same time, they have come under fire for failing to mitigate the risks of perceived ‘toxic’ content or behaviour on their platforms. In an effort to better cope with content moderation and to combat hate speech, ‘dangerous organisations’, and other bad actors present on platforms, discussion has turned to the role that automated machine-learning (ML) tools might play. This paper contributes to thinking about the role and suitability of ML for content moderation on community platforms such as Reddit and Facebook. In particular, it looks at how ML tools operate (or fail to operate) effectively at the intersection between online sentiment within communities and social and platform expectations of acceptable discourse. Through an examination of the r/MGTOW subreddit we problematise current understandings of the notion of ‘toxicity’ as applied to cultural or social sub-communities online and explain how this interacts with Google’s Perspective tool.
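For reference, a minimal sketch of how a comment can be scored with Google's Perspective tool via its public Comment Analyzer endpoint; the request and response shapes follow the published API, while the API key and example comment are placeholders.

```python
import json
import urllib.request

# Placeholder key; the endpoint and TOXICITY attribute follow the public
# Perspective Comment Analyzer API.
API_KEY = "YOUR_API_KEY"
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

def toxicity_score(text: str) -> float:
    """Return Perspective's summary TOXICITY probability for a text."""
    payload = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    request = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    return result["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

print(toxicity_score("example comment text"))  # hypothetical input
```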

2021 ◽  
pp. 000276422110050
Author(s):  
Rita Kirk ◽  
Dan Schill

Over the past decade, the rise of political extremism and its associated linguistic expression has led communication companies to restrict hate speech and, in many cases, ban speech emanating from specific users. Before we attempt to regulate expression per se—whether through “cancelling” expression, “deplatforming” speakers through suspensions or platform restrictions, rewriting social media terms of service, or criminalizing harmful speech—we should seek a clearer understanding of how hate appeals are used to accomplish particular communication purposes. In this analysis, we examine hate speech as a stratagem—an artifice or trick of war—used to great effect during the 2020 election. Our concern is how this tactic is used to harm the body politic, reducing citizens’ ability to engage with divergent publics and points of view and threatening democratic rule. Critically, we must understand how communication on social media platforms is being used to destabilize the communication environment and prevent the robust discussion of ideas in a public forum, a prerequisite for democratic governance.


2020 ◽  
Author(s):  
Wallace Chipidza ◽  
Jie Yan

There is vigorous debate as to whether influential social media platforms like Twitter and Facebook should censor objectionable posts by government officials in the United States and elsewhere. Although these platforms have resisted pressure to censor such posts in the past, Twitter recently flagged five posts by United States President Donald J. Trump on the rationale that the tweets contained inaccurate or inflammatory content. In this paper, we examine preliminary evidence as to whether these posts were retweeted less or more than expected. We employ 10 machine learning (ML) algorithms to estimate the expected number of retweets based on 8 features of each tweet, drawn from historical data since President Trump was elected: number of likes, word count, readability, polarity, subjectivity, presence of link or multimedia content, time of day of posting, and number of days since Trump’s election. All 10 ML algorithms agree that the three flagged tweets for which we had retweet data were retweeted at higher rates than expected. These results suggest that flagging tweets by government officials might be counterproductive, amplifying rather than curbing the spread of content deemed objectionable by social media platforms.
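As an illustration of this estimation setup (not the authors' code), the sketch below trains several regressors on historical tweet features and compares flagged tweets' actual retweet counts against each model's prediction; the file names, column names, and the particular model choices are assumptions based on the abstract.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Assumed feature columns, following the 8 features listed in the abstract.
FEATURES = ["likes", "word_count", "readability", "polarity", "subjectivity",
            "has_media", "hour_of_day", "days_since_election"]

history = pd.read_csv("historical_tweets.csv")   # hypothetical training data
X, y = history[FEATURES], history["retweets"]

flagged = pd.read_csv("flagged_tweets.csv")      # hypothetical flagged tweets

# Fit each regressor on history, then inspect residuals for flagged tweets:
# positive residuals mean a tweet spread more than the model expected.
for model in (LinearRegression(),
              RandomForestRegressor(random_state=0),
              GradientBoostingRegressor(random_state=0)):
    model.fit(X, y)
    expected = model.predict(flagged[FEATURES])
    print(type(model).__name__, flagged["retweets"].values - expected)
```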


2021 ◽  
Vol 7 (2) ◽  
pp. 205630512110088
Author(s):  
Benjamin N. Jacobsen ◽  
David Beer

As social media platforms have developed over the past decade, they are no longer simply sites for interactions and networked sociality; they also now facilitate backwards glances to previous times, moments, and events. Users’ past content is turned into definable objects that can be scored, rated, and resurfaced as “memories.” There is, then, a need to understand how metrics have come to shape digital and social media memory practices, and how the relationship between memory, data, and metrics can be further understood. This article seeks to outline some of the relations between social media, metrics, and memory. It examines how metrics shape remembrance of the past within social media. Drawing on qualitative interviews as well as focus group data, the article examines the ways in which metrics are implicated in memory making and memory practices. This article explores the effect of social media “likes” on people’s memory attachments and emotional associations with the past. The article then examines how memory features incentivize users to keep remembering through accumulation. It also examines how numerating engagements leads to a sense of competition in how the digital past is approached and experienced. Finally, the article explores the tensions that arise in quantifying people’s engagements with their memories. This article proposes the notion of quantified nostalgia in order to examine how metrics are variously performative in memory making, and how regimes of ordinary measures can figure in the engagement and reconstruction of the digital past in multiple ways.


2021 ◽  
pp. 000276422198976
Author(s):  
Darsana Vijay ◽  
Alex Gekker

TikTok is commonly known as a playful, silly platform where teenagers share 15-second videos of crazy stunts or act out funny snippets from popular culture. In the past few years, it has experienced exponential growth and popularity, unseating Facebook as the most downloaded app. Interestingly, recent news coverage notes the emergence of TikTok as a political actor in the Indian context, raising concerns over the abundance of divisive content, hate speech, and the lack of platform accountability in countering these issues. In this article, we analyze how politics is performed on TikTok and how the platform’s design shapes such expressions and their circulation. What does the playful architecture of TikTok mean for the nature of its political discourse and participation? To answer this, we review existing academic work on play, media, and political participation and then examine the case of Sabarimala through the double lens of ludic engagement and platform-specific features. We demonstrate the efficacy of play as a productive heuristic for studying political contention on social media platforms. Finally, we turn to ludo-literacy as a potential strategy that can reveal the structures ordering playful political participation and initiate alternative modes of playing politics.


Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 556
Author(s):  
Thaer Thaher ◽  
Mahmoud Saheb ◽  
Hamza Turabieh ◽  
Hamouda Chantar

Fake or false information on social media platforms is a significant challenge that deliberately misleads users through rumors, propaganda, or deceptive information about a person, organization, or service. Twitter is one of the most widely used social media platforms, especially in the Arab region, where the number of users is steadily increasing, accompanied by an increase in the rate of fake news. This has drawn the attention of researchers to providing a safe online environment free of misleading information. This paper proposes a smart classification model for the early detection of fake news in Arabic tweets utilizing Natural Language Processing (NLP) techniques, Machine Learning (ML) models, and the Harris Hawks Optimizer (HHO) as a wrapper-based feature selection approach. An Arabic Twitter corpus composed of 1862 previously annotated tweets was used to assess the efficiency of the proposed model. The Bag of Words (BoW) model is applied with different term-weighting schemes for feature extraction. Eight well-known learning algorithms are investigated with varying combinations of features, including user-profile, content-based, and word features. The results show that Logistic Regression (LR) with Term Frequency-Inverse Document Frequency (TF-IDF) achieves the best performance. Moreover, feature selection based on the binary HHO algorithm plays a vital role in reducing dimensionality, thereby enhancing the learning model’s performance for fake news detection. Interestingly, the proposed BHHO-LR model yields an improvement of 5% over previous work on the same dataset.
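A simplified sketch of the TF-IDF plus Logistic Regression core described above, with placeholder tweets; wrapper-based feature selection is illustrated with a random binary mask standing in for the binary Harris Hawks search, which is beyond a short example.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder corpus; the real data are annotated Arabic tweets.
tweets = ["placeholder genuine tweet one", "placeholder genuine tweet two",
          "placeholder fake tweet one", "placeholder fake tweet two"]
labels = [0, 0, 1, 1]                         # 0 = genuine, 1 = fake

X = TfidfVectorizer().fit_transform(tweets)   # TF-IDF term weighting

# A candidate feature subset; the binary HHO would search over such masks.
rng = np.random.default_rng(0)
mask = rng.integers(0, 2, size=X.shape[1]).astype(bool)
if not mask.any():
    mask[0] = True                            # keep at least one feature
X_subset = X[:, np.flatnonzero(mask)]

# Wrapper fitness: cross-validated accuracy of LR on the selected subset.
clf = LogisticRegression(max_iter=1000)
fitness = cross_val_score(clf, X_subset, labels, cv=2).mean()
print(f"feature-subset fitness: {fitness:.3f}")
```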


2021 ◽  
pp. 1-13
Author(s):  
C S Pavan Kumar ◽  
L D Dhinesh Babu

Sentiment analysis is widely used to retrieve the hidden sentiments in medical discussions on online social networking platforms such as Twitter, Facebook, and Instagram. People often convey their feelings about their medical problems over social media platforms. Practitioners and healthcare workers have started to observe these discussions to assess the impact of health-related issues among the people, which helps in providing better care to improve quality of life. Dementia is a serious disease in Western countries such as the United States and the United Kingdom, and the respective governments provide facilities to the affected people. There is much chatter over social media platforms concerning patients’ care, healthy measures to follow to avoid the disease, and early indications to check for. This chatter has to be carefully monitored to help officials take the necessary precautions for the betterment of those affected. A novel feature-engineering architecture involving feature-splits, with a pipeline that can be used with any machine learning model, is proposed for sentiment analysis of medical chatter over online social networks. The proposed model uses a fuzzy membership function to refine the outputs: the sentiment score obtained by the machine learning model is subjected to fuzzification and defuzzification using the trapezoidal membership function and the center-of-sums method, respectively. Three datasets are considered for comparing the proposed and the regular model. The proposed approach delivered better results than the normal approach and proved to be effective for sentiment analysis of medical discussions over online social networks.
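An illustrative sketch of this fuzzy refinement step: a sentiment score in [-1, 1] is fuzzified with trapezoidal membership functions and defuzzified with the center-of-sums method. The fuzzy-set boundaries below are assumptions, not the authors' parameters.

```python
import numpy as np

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises over [a,b], flat over [b,c], falls over [c,d]."""
    return np.clip(np.minimum((x - a) / (b - a), (d - x) / (d - c)), 0.0, 1.0)

# Assumed fuzzy sets over the sentiment axis: negative, neutral, positive.
SETS = {
    "negative": (-1.2, -1.0, -0.4, 0.0),
    "neutral":  (-0.4, -0.1, 0.1, 0.4),
    "positive": (0.0, 0.4, 1.0, 1.2),
}

def refine(score: float, resolution: int = 1001) -> float:
    """Fuzzify a score, then defuzzify by center of sums."""
    x = np.linspace(-1.0, 1.0, resolution)
    total = np.zeros_like(x)
    for a, b, c, d in SETS.values():
        degree = trapezoid(score, a, b, c, d)                   # fuzzification
        total += np.minimum(trapezoid(x, a, b, c, d), degree)   # clipped set
    # Center of sums: individual clipped areas are summed, overlaps included.
    return float((x * total).sum() / total.sum()) if total.sum() else score

print(refine(0.35))
```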


2021 ◽  
Vol 13 (3) ◽  
pp. 80
Author(s):  
Lazaros Vrysis ◽  
Nikolaos Vryzas ◽  
Rigas Kotsakis ◽  
Theodora Saridou ◽  
Maria Matsiola ◽  
...  

Social media services make it possible for an increasing number of people to express their opinion publicly. In this context, large amounts of hateful comments are published daily. The PHARM project aims at monitoring and modeling hate speech against refugees and migrants in Greece, Italy, and Spain. To this end, a web interface for the creation and querying of a multi-source database containing hate speech-related content is implemented and evaluated. The selected sources include Twitter, YouTube, and Facebook comments and posts, as well as comments and articles from a selected list of websites. The interface allows users to search the existing database, scrape social media using keywords, annotate records through a dedicated platform, and contribute new content to the database. Furthermore, functionality for hate speech detection and sentiment analysis of texts is provided, making use of novel methods and machine learning models. The interface can be accessed online through a graphical user interface compatible with modern internet browsers. For the evaluation of the interface, a multifactor questionnaire was formulated, aiming to record users’ opinions about the web interface and its functionality.
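To make the multi-source database idea concrete, here is a minimal sketch of a records table and a keyword query of the kind the interface exposes; the schema, table, and column names are invented for illustration and are not the PHARM project's actual design.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE records (
        id INTEGER PRIMARY KEY,
        source TEXT,        -- 'twitter', 'youtube', 'facebook', 'web'
        language TEXT,      -- 'el', 'it', 'es'
        text TEXT,
        hate_label INTEGER, -- annotation: 1 = hate speech, 0 = not
        sentiment REAL      -- score from the sentiment analysis module
    )
""")
con.execute(
    "INSERT INTO records (source, language, text, hate_label, sentiment) "
    "VALUES ('twitter', 'es', 'example tweet mentioning migrants', 0, 0.1)")

# Keyword search across all sources, as the query feature might perform it.
rows = con.execute(
    "SELECT source, text FROM records WHERE text LIKE ?", ("%migrants%",)
).fetchall()
print(rows)
```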


Electronics ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 1332
Author(s):  
Hong Fan ◽  
Wu Du ◽  
Abdelghani Dahou ◽  
Ahmed A. Ewees ◽  
Dalia Yousri ◽  
...  

Social media has become an essential facet of modern society, wherein people share their opinions on a wide variety of topics. Social media is quickly becoming indispensable for a majority of people, and many cases of social media addiction have been documented. Social media platforms such as Twitter have demonstrated over the years the value they provide, such as connecting people from all over the world with different backgrounds. However, they have also shown harmful side effects that can have serious consequences. One such harmful side effect of social media is the immense toxicity that can be found in various discussions. The word toxic has become synonymous with online hate speech, internet trolling, and sometimes outrage culture. In this study, we build an efficient model to detect and classify toxicity in user-generated social media content using Bidirectional Encoder Representations from Transformers (BERT). The pre-trained BERT model and three of its variants have been fine-tuned on a well-known labeled toxic comment dataset, the Kaggle public dataset from the Toxic Comment Classification Challenge. Moreover, we test the proposed models on two datasets collected from Twitter in two different periods, using hashtags related to the UK’s Brexit, to detect toxicity in user-generated content (tweets). The results show that the proposed model can efficiently classify and analyze toxic tweets.
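A minimal fine-tuning sketch (not the authors' code) for binary toxicity classification with a pre-trained BERT model, using the Hugging Face transformers library; the file name, column names, and hyperparameters are assumptions.

```python
import pandas as pd
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

class CommentDataset(Dataset):
    """Wraps tokenized comments and labels for the Trainer API."""
    def __init__(self, texts, labels, tokenizer):
        self.enc = tokenizer(texts, truncation=True, padding=True, max_length=128)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)        # toxic vs. non-toxic

df = pd.read_csv("toxic_comments.csv")        # hypothetical Kaggle-style dump
train = CommentDataset(df["comment_text"].tolist(), df["toxic"].tolist(),
                       tokenizer)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-toxicity", num_train_epochs=2,
                           per_device_train_batch_size=16),
    train_dataset=train,
)
trainer.train()
```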


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0256696
Author(s):  
Anna Keuchenius ◽  
Petter Törnberg ◽  
Justus Uitermark

Despite the prevalence of disagreement between users on social media platforms, studies of online debates typically only look at positive online interactions, represented as networks with positive ties. In this paper, we hypothesize that the systematic neglect of conflict that these network analyses induce leads to misleading results on polarized debates. We introduce an approach to bring in negative user-to-user interaction, by analyzing online debates using signed networks with positive and negative ties. We apply this approach to the Dutch Twitter debate on ‘Black Pete’—an annual Dutch celebration with racist characteristics. Using a dataset of 430,000 tweets, we apply natural language processing and machine learning to identify: (i) users’ stance in the debate; and (ii) whether the interaction between users is positive (supportive) or negative (antagonistic). Comparing the resulting signed network with its unsigned counterpart, the retweet network, we find that traditional unsigned approaches distort debates by conflating conflict with indifference, and that the inclusion of negative ties changes and enriches our understanding of coalitions and division within the debate. Our analysis reveals that some groups are attacking each other, while others rather seem to be located in fragmented Twitter spaces. Our approach identifies new network positions of individuals that correspond to roles in the debate, such as leaders and scapegoats. These findings show that representing the polarity of user interactions as signs of ties in networks substantively changes the conclusions drawn from polarized social media activity, which has important implications for various fields studying online debates using network analysis.
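An illustrative sketch of the signed-network idea: each user-to-user interaction carries a sign attribute, so supportive and antagonistic ties can be separated rather than conflated or dropped. The edge data here are invented for illustration.

```python
import networkx as nx

G = nx.DiGraph()
# (source, target, sign): +1 supportive (e.g., a retweet),
# -1 antagonistic (e.g., a hostile reply).
interactions = [("a", "b", 1), ("b", "a", 1), ("a", "c", -1),
                ("c", "a", -1), ("c", "d", 1)]
for u, v, sign in interactions:
    G.add_edge(u, v, sign=sign)

positive = G.edge_subgraph(
    (u, v) for u, v, d in G.edges(data=True) if d["sign"] > 0)
negative = G.edge_subgraph(
    (u, v) for u, v, d in G.edges(data=True) if d["sign"] < 0)

# An unsigned retweet network would keep only the positive ties; the signed
# view also shows who is attacking whom.
print("supportive ties:", list(positive.edges))
print("antagonistic ties:", list(negative.edges))
```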


It is evident that there has been enormous growth in terrorist attacks in recent years. Online terrorism has also been taking root, growing alongside advances in internet technology. Such activity includes social media threats, such as hate speech and comments provoking terror, on platforms like Twitter and Facebook. These activities must be prevented before they make an impact. In this paper, we build classifiers that group and predict terrorism activities using the k-NN and random forest algorithms. The purpose of this project is to use the Global Terrorism Database (GTD), a publicly available database containing information on terrorist events worldwide from 1970 through 2017, to train a machine learning-based intelligent system to predict any future events that could threaten society.
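A hypothetical sketch of this classification setup: k-NN and random forest classifiers trained on GTD features. The file name, feature columns, and target label below are placeholders; the real GTD schema is far richer, and features would need encoding choices of their own.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

gtd = pd.read_csv("globalterrorismdb.csv")     # hypothetical GTD export
# Placeholder feature columns, assumed to hold numeric codes.
X = gtd[["year", "region", "attack_type", "weapon_type"]]
y = gtd["success"]                             # placeholder target label

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compare the two classifiers named in the abstract on held-out data.
for clf in (KNeighborsClassifier(n_neighbors=5),
            RandomForestClassifier(n_estimators=100, random_state=0)):
    clf.fit(X_train, y_train)
    print(type(clf).__name__, clf.score(X_test, y_test))
```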

