Multi-Ideology ISIS/Jihadist White Supremacist (MIWS) Dataset for Multi-Class Extremism Text Classification

Data ◽  
2021 ◽  
Vol 6 (11) ◽  
pp. 117
Author(s):  
Mayur Gaikwad ◽  
Swati Ahirrao ◽  
Shraddha Phansalkar ◽  
Ketan Kotecha

Social media platforms are a popular choice for extremist organizations to disseminate their perceptions, beliefs, and ideologies. This information is generally based on selective reporting and is subjective in content. However, the radical presentation of this disinformation and its outreach on social media lead to an increased number of susceptible audiences. Hence, detection of extremist text on social media platforms is a significant area of research. The unavailability of extremism text datasets is a challenge in online extremism research. The lack of emphasis on classifying extremism text into propaganda, radicalization, and recruitment classes is a further challenge. The lack of data validation methods also challenges the accuracy of extremism detection. This research addresses these challenges and presents a seed dataset of multi-ideology, multi-class extremism text. It describes the construction of a multi-ideology ISIS/Jihadist White Supremacist (MIWS) dataset from recent tweets collected from Twitter. The dataset can be used to classify extremist text into common types such as propaganda, radicalization, and recruitment. Additionally, the seed dataset is statistically validated with the coherence score of Latent Dirichlet Allocation (LDA) and with word mover’s distance computed using pretrained Google News vectors. The dataset shows effectiveness in its construction, with good coherence scores within a topic and appropriate distance measures between topics. This dataset is the first publicly accessible multi-ideology, multi-class extremism text dataset to reinforce research on extremism text detection on social media platforms.
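The distance-based validation mentioned in this abstract can be illustrated with a small sketch. It does not reproduce the authors' pipeline: it uses the relaxed word mover's distance (a cheaper lower bound on the exact word mover's distance) and hypothetical 2-D toy vectors in place of the pretrained Google News embeddings. The names `VECTORS`, `one_sided_cost`, and `relaxed_wmd` are illustrative assumptions.

```python
import math

# Hypothetical 2-D toy vectors standing in for pretrained word embeddings.
VECTORS = {
    "join": (0.9, 0.1),
    "recruit": (0.8, 0.2),
    "fight": (0.1, 0.9),
    "attack": (0.2, 0.8),
}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided_cost(source, target, vectors):
    """Mean distance from each source token to its nearest target token."""
    total = 0.0
    for word in source:
        total += min(euclidean(vectors[word], vectors[t]) for t in target)
    return total / len(source)


def relaxed_wmd(doc_a, doc_b, vectors=VECTORS):
    """Relaxed WMD: the larger of the two one-sided nearest-word costs."""
    a = [w for w in doc_a if w in vectors]
    b = [w for w in doc_b if w in vectors]
    if not a or not b:
        raise ValueError("both documents need at least one known word")
    return max(one_sided_cost(a, b, vectors), one_sided_cost(b, a, vectors))


# Documents drawn from the same toy topic should be closer than
# documents drawn from different toy topics.
same_topic = relaxed_wmd(["join", "recruit"], ["recruit", "join"])
cross_topic = relaxed_wmd(["join", "recruit"], ["fight", "attack"])
print(same_topic, cross_topic)
```

In this reading, a small distance within a topic and a larger distance between topics is the kind of separation the abstract describes as "appropriate distance measures between topics".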

2021 ◽  
pp. 147078532110475
Author(s):  
Manit Mishra

The ubiquity of social media platforms facilitates the free flow of online chatter related to customer experience. Twitter is a prominent social media platform for sharing experiences, and e-retail firms are rapidly emerging as the preferred shopping destination. This study explores customers’ online shopping experience tweets. Customers tweet about their online shopping experience based on moments of truth shaped by encounters across different touchpoints. We aggregate 25,173 such tweets related to six e-retailers tweeted over a 5-year period. Grounded in agency theory, we extract the topics underlying these customer experience tweets using unsupervised latent Dirichlet allocation. The output reveals five topics that manifest in customer experience tweets related to online shopping: ordering, customer service interaction, entertainment, service outcome failure, and service process failure. The extracted topics are validated through inter-rater agreement with human experts. The study thus derives topics from tweets about e-retail customer experience and thereby facilitates prioritization of decision-making pertaining to critical service encounter touchpoints.


2020 ◽  
Author(s):  
Junze Wang ◽  
Ying Zhou ◽  
Wei Zhang ◽  
Richard Evans ◽  
Chengyan Zhu

BACKGROUND The COVID-19 pandemic has created a global health crisis that is affecting economies and societies worldwide. During times of uncertainty and unexpected change, people have turned to social media platforms as communication tools and primary information sources. Platforms such as Twitter and Sina Weibo have allowed communities to share discussion and emotional support; they also play important roles for individuals, governments, and organizations in exchanging information and expressing opinions. However, research that studies the main concerns expressed by social media users during the pandemic is limited. OBJECTIVE The aim of this study was to examine the main concerns raised and discussed by citizens on Sina Weibo, the largest social media platform in China, during the COVID-19 pandemic. METHODS We used a web crawler tool and a set of predefined search terms (New Coronavirus Pneumonia, New Coronavirus, and COVID-19) to investigate concerns raised by Sina Weibo users. Textual information and metadata (number of likes, comments, retweets, publishing time, and publishing location) of microblog posts published between December 1, 2019, and July 31, 2020, were collected. After segmenting the words of the collected text, we used a topic modeling technique, latent Dirichlet allocation (LDA), to identify the most common topics posted by users. We analyzed the emotional tendencies of the topics, calculated the proportional distribution of the topics, performed user behavior analysis on the topics using data collected from the number of likes, comments, and retweets, and studied the changes in user concerns and differences in participation between citizens living in different regions of mainland China. RESULTS Based on the 203,191 eligible microblog posts collected, we identified 17 topics and grouped them into 8 themes.
These topics were pandemic statistics, domestic epidemic, epidemics in other countries worldwide, COVID-19 treatments, medical resources, economic shock, quarantine and investigation, patients’ outcry for help, work and production resumption, psychological influence, joint prevention and control, material donation, epidemics in neighboring countries, vaccine development, fueling and saluting antiepidemic action, detection, and study resumption. The mean sentiment was positive for 11 topics and negative for 6 topics. The topic with the highest mean of retweets was domestic epidemic, while the topic with the highest mean of likes was quarantine and investigation. CONCLUSIONS Concerns expressed by social media users are highly correlated with the evolution of the global pandemic. During the COVID-19 pandemic, social media has provided a platform for Chinese government departments and organizations to better understand public concerns and demands. Similarly, social media has provided channels to disseminate information about epidemic prevention and has influenced public attitudes and behaviors. Government departments, especially those related to health, can create appropriate policies in a timely manner through monitoring social media platforms to guide public opinion and behavior during epidemics.


2021 ◽  
Author(s):  
Ankita Agarwal ◽  
William Romine ◽  
Tanvi Banerjee

Understanding public outlook in healthcare management is important in the study of various diseases. With respect to vaccinations, which play a major role in combating vaccine-preventable diseases, studying their acceptance or rejection by the public is useful. For the influenza vaccine in particular, studies of public opinion and views are ongoing. Social media platforms like Twitter help us to leverage thoughts and attitudes related to the flu vaccine. The data set used for our analysis contained tweets related to vaccines, which were collected using vaccine-related keywords over a period of twelve months from February 2018 to January 2019. From these tweets, we selected those specific to the flu vaccine and generated our corpus for further study. By using Latent Dirichlet Allocation (LDA), we identified eighteen topics comprising six major themes which best represented our corpus. In this paper, we discuss these six themes and subsequently analyze the trend observed in these themes over a period of twelve months. The themes identified covered various aspects related to the flu vaccine. Among the six major themes, four showed a distinctive temporal trend with respect to the annual flu season.


2020 ◽  
Author(s):  
Tasmiah Nuzhath ◽  
Samia Tasnim ◽  
Rahul Kumar Sanjwal ◽  
Nusrat Fahmida Trisha ◽  
Mariya Rahman ◽  
...  

Background: The coronavirus disease (COVID-19) pandemic has caused a significant burden of mortality and morbidity. A vaccine will be the most effective global preventive strategy to end the pandemic. Studies have maintained that exposure to negative sentiments related to vaccination on social media increases vaccine hesitancy and refusal. Despite the influence social media has on vaccination behavior, there is a lack of studies exploring the public's exposure to misinformation, conspiracy theories, and concerns on Twitter regarding a potential COVID-19 vaccination. Objective: The study aims to identify the major thematic areas about a potential COVID-19 vaccination based on the contents of Twitter data. Method: We retrieved 1,286,659 publicly available tweets posted within the timeline of July 19, 2020, to August 19, 2020, leveraging the Twint package. Following the extraction, we used Latent Dirichlet Allocation for topic modelling and identified 20 topics discussed in the tweets. We selected 4,868 tweets with the highest probability of belonging to a specific cluster and manually labeled them as positive, negative, neutral, or irrelevant. The negative tweets were further assigned to a theme and subtheme based on the content. Result: The negative tweets were further categorized into 7 major themes: “safety and effectiveness,” “misinformation,” “conspiracy theories,” “mistrust of scientists and governments,” “lack of intent to get a COVID-19 vaccine,” “freedom of choice,” and “religious beliefs.” Negative tweets predominantly consisted of misleading statements (n=424) that immunization against coronavirus is unnecessary as the survival rate is high. The second most prevalent theme to emerge was tweets constituting safety and effectiveness related concerns (n=276) regarding the side effects of a potential vaccine developed at an unprecedented speed.
Conclusion: Our findings suggest a need to formulate a large-scale vaccine communication plan that will address the safety concerns and debunk the misinformation and conspiracy theories spreading across social media platforms, increasing the public's acceptance of a COVID-19 vaccination.


Significance Millions of them have joined alternative social media platforms such as Parler and Telegram. Content alleging that the election was 'stolen' circulates freely on these platforms, together with conspiracy theories and anti-Semitic and white supremacist content. Impacts Conservative social media platforms will shift, not solve the misinformation problem. Conservatives migrating to their own platforms may help fulfil the claim that liberal voices dominate Facebook and Twitter. Legislation to rein in 'big tech' may exclude smaller platforms, leaving some that cater to the radical right unregulated.


2021 ◽  
Author(s):  
Iain Cruickshank ◽  
Tamar Ginossar ◽  
Jason Sulskis ◽  
Elena Zheleva ◽  
Tanya Berger-Wolf

BACKGROUND The onset of the COVID-19 pandemic and the consequent “infodemic” highlighted the role that social media play in increasing vaccine hesitancy. Despite the efforts to curtail the spread of misinformation, the anti-vaccination movement continues to use Twitter and other social media platforms to advance its messages. Although users typically engage with different social media platforms, research on vaccination discourse has typically focused on single platforms. Understanding the content and dynamics of external content shared in vaccine-related conversations on Twitter during the COVID-19 pandemic can shed light on the use of different sources, including traditional media and social media, by the anti-vaccination movement. In particular, examining how YouTube videos are shared within vaccination-related tweets is important in understanding the spread of anti-vaccination narratives. OBJECTIVE Informed by agenda-setting theory, this study aimed to use machine learning to understand the content and dynamics of external websites shared in vaccine-related tweets posted in COVID-19 conversations on Twitter. METHODS We screened around 5 million tweets posted to COVID-19 related conversations to include tweets that discussed vaccination. We then identified external content, including the most tweeted web domains and URLs within these tweets and the number of days they were shared. The topics and dynamics of tweeted YouTube videos were further analyzed by using Latent Dirichlet Allocation to topic-model the transcripts of the YouTube videos, and by independent coders. RESULTS Of 841,896 vaccination-related tweets identified, 128,408 (22.1%) included external content. A wide range of external websites were shared. The 20 most tweeted websites constituted 10.9% of the shared websites and were typically shared for only 2-3 days within a one-month period. Traditional media constituted the majority of these 20 most tweeted URLs.
YouTube had the greatest number of unique URLs of any URL domain and was also the most tweeted domain over time. The majority (n=15) of the 20 most tweeted videos opposed vaccinations and featured conspiracy theories. Analysis of the transcripts of 1,280 shared YouTube videos indicated a high frequency of conspiracy theories. CONCLUSIONS Our study reveals that sharing URLs over Twitter is a common communication strategy. Whereas shared URLs overall demonstrated a strong presence of legacy media organizations, YouTube videos were used to spread anti-vaccination messages. Produced by individuals or by foreign governments, these videos emerged as a major driver for sharing vaccine-related conspiracy theories. Future interventions should take into account cross-platform use to counteract this misinformation.


2020 ◽  
Author(s):  
Robert Robert ◽  
Pari Delir Haghighi ◽  
Frada Burstein ◽  
Donna Urquhart ◽  
Flavia Cicuttini

BACKGROUND Although personal experiences of low back pain have traditionally been explored through qualitative studies, social media content analysis has the potential to complement these studies by providing a deeper understanding of how problems such as pain are perceived by those who have it, and of the effect of contextual variables on individuals and the community. OBJECTIVE The objective of this study was to perform content analysis of tweets to identify contextual variables of the low back pain (LBP) experience from a first-person perspective to better understand individuals’ beliefs and perceptions. METHODS We analysed 896,867 cleaned tweets about low back pain posted between 1 January 2014 and 31 December 2018. We tested and compared Latent Dirichlet Allocation (LDA), Dirichlet Multinomial Mixture (DMM), GPU-DMM, Biterm Topic Model (BTM) and Non-negative Matrix Factorization (NMF) for identifying topics associated with tweets. A coherence score was determined to identify the best model. RESULTS LDA outperformed all other algorithms, resulting in the highest coherence score. The best model was LDA with 60 topics and a coherence score of 0.562. With input from domain experts, the 60 topics were validated and grouped into 19 contextual categories. “Emotion and Beliefs” had the largest proportion of the total tweets (17.6%), followed by “Physical Activity” (13.85%) and “Daily Life” (9%), while “Food and Drink”, “Weather” and “Not Being Understood” had the least (1.29%, 1.13% and 1.02% respectively). Of the 11 topics within “Emotion and Beliefs”, 72% had negative sentiment. CONCLUSIONS Using social media allows access to data from a larger, heterogeneous and geographically distributed population, which is not possible using traditional qualitative methods that are generally limited to a small population.
Individuals may be more inclined to express their feelings and emotions freely on social media sites, where the data is collected in an unsolicited manner, compared to common, rigid data collection methods. A content analysis of tweets identified common themes in the area of low back pain that are consistent with findings from conventional qualitative studies but provide a more granular view of the individuals’ perspectives related to low back pain. This understanding has the potential to assist with developing more effective and personalized models of care to improve treatment outcomes.
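Selecting a topic model by coherence score, as this abstract describes, can be sketched in a few lines. The abstract does not say which coherence measure was used, so the UMass variant below is an assumption, chosen because it needs only document co-occurrence counts. The toy documents, the two candidate "models", and the function names are all hypothetical.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic: sum of log((D(wi, wj) + 1) / D(wj))."""
    doc_sets = [set(doc) for doc in documents]
    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            wi, wj = top_words[i], top_words[j]
            d_j = sum(1 for d in doc_sets if wj in d)
            d_ij = sum(1 for d in doc_sets if wi in d and wj in d)
            if d_j == 0:
                continue  # skip words that never occur in the corpus
            score += math.log((d_ij + 1) / d_j)
    return score


def mean_coherence(topics, documents):
    """Average the per-topic coherence over all topics of one model."""
    return sum(umass_coherence(t, documents) for t in topics) / len(topics)


# Hypothetical tokenized tweets.
docs = [
    ["back", "pain", "gym", "workout"],
    ["back", "pain", "sleep", "bed"],
    ["gym", "workout", "back"],
    ["sleep", "bed", "pain"],
]

# Hypothetical top words per topic from two candidate models.
candidates = {
    "model_a": [["gym", "workout"], ["sleep", "bed"]],
    "model_b": [["gym", "bed"], ["sleep", "workout"]],
}

scores = {name: mean_coherence(topics, docs) for name, topics in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The model whose top words co-occur most often in the same documents scores highest, which is the sense in which a coherence score can rank competing topic models.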


2019 ◽  
Vol 11 (24) ◽  
pp. 7108
Author(s):  
Jun Shao ◽  
Qinlin Ying ◽  
Shujin Shu ◽  
Alastair M. Morrison ◽  
Elizabeth Booth

The tourist shopping experience is the sum of the satisfaction or dissatisfaction from the individual attributes of purchased products and services. With the popularity of the Internet and travel review websites, more people choose to upload their tour experiences on their favorite social media platforms, which can influence another’s travel planning and choices. However, there have been few investigations of social media reviews of tourist shopping experiences and especially of satisfaction with museum tourism shopping. This research analyzed the user-generated reviews of the National Gallery (NG) in London written in the English language on TripAdvisor to learn more about tourist shopping experience in museums. The Latent Dirichlet Allocation (LDA) topic model was used to discover the underlying themes of online reviews and keywords related to these shopping experiences. Sentiment analysis based on a purpose-developed dictionary was conducted to explore the dissatisfying aspects of tourist shopping experiences. The results provide a framework for museums to improve shopping experiences and enhance their future development.


2019 ◽  
Vol 33 (4) ◽  
pp. 1053-1075
Author(s):  
Vidushi Pandey ◽  
Sumeet Gupta ◽  
Manojit Chattopadhyay

Purpose The purpose of this paper is to explore how the use of social media by citizens has impacted the traditional conceptualization and operationalization of political participation in society. Design/methodology/approach This study is based on Teorell et al.’s (2007) classification of political participation, which is modified to suit the current context of social media. The authors classified 15,460 tweets along three parameters suggested in the framework with the help of supervised text classification algorithms. Findings The analysis reveals that Activism is the most prominent form of political participation undertaken by people on Twitter. Other activities that were undertaken include Formal Political participation and Consumer participation. The analysis also reveals that the identity of the participant does not play a classifying role as expected from the theoretical framework. It was found that social media as a platform facilitates new forms of participation which are not feasible offline. Research limitations/implications The current work considers only the microblogging platform of Twitter as the data source. For a more comprehensive insight, analysis of other social media platforms is also required. Originality/value To the best of the authors’ knowledge, this is one of the few analyses where such a large database covering multiple social media events has been created and analysed using supervised text classification algorithms. A large proportion of previous studies on social media have been based on case studies and have limited analysis to a particular event on social media. Although there exist a few works that have studied a vast and varied collection of social media data (Gaby and Caren, 2012; Shirazi, 2013; Rane and Salem, 2012), such efforts are few in number. This study aims to add to that stream of work where a wider and more generalized set of social media data is studied.


2021 ◽  
Author(s):  
Dominik Wawrzuta ◽  
Mariusz Jaworski ◽  
Joanna Gotlib ◽  
Mariusz Panczyk

BACKGROUND Despite the existence of an effective vaccine, measles still threatens the health and lives of many Europeans. Notably, during the COVID-19 pandemic, measles vaccine uptake declined; as a result, after the pandemic, European countries will have to increase vaccination rates to restore the extent of vaccination coverage among the population. Because information obtained from social media is one of the main causes of vaccine hesitancy, knowledge of the nature of information pertaining to measles that is shared on social media may help create educational campaigns. OBJECTIVE In this study, we aim to define the characteristics of European news about measles shared on social media platforms (ie, Facebook, Twitter, and Pinterest) from 2017 to 2019. METHODS We downloaded and translated (into English) 10,305 articles on measles published in European Union countries. Using latent Dirichlet allocation, we identified main topics and estimated the sentiments expressed in these articles. Furthermore, we used linear regression to determine factors related to the number of times a given article was shared on social media. RESULTS We found that, in most European social media posts, measles is only discussed in the context of local European events. Articles containing educational information and describing world outbreaks appeared less frequently. The most common emotions identified from the study’s news data set were fear and trust. Yet, it was found that readers were more likely to share information on educational topics and the situation in Germany, Ukraine, Italy, and Samoa. A high amount of anger, joy, and sadness expressed within the text was also associated with a higher number of shares. CONCLUSIONS We identified which features of news articles were related to increased social media shares. We found that social media users prefer sharing educational news to sharing informational news.
Appropriate emotional content can also increase the willingness of social media users to share an article. Effective media content that promotes measles vaccinations should contain educational or scientific information, as well as specific emotions (such as anger, joy, or sadness). Articles with this type of content may offer the best chance of disseminating vital messages to a broad social media audience.
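The regression step in this abstract, relating article features to share counts, can be illustrated with a minimal ordinary least squares fit. This is a sketch under strong simplifying assumptions: a single predictor rather than the study's several factors, and invented toy numbers. The names `fit_simple_ols`, `anger_score`, and `shares` are hypothetical.

```python
def fit_simple_ols(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented toy data: an emotion score per article and its share count.
anger_score = [0.0, 1.0, 2.0, 3.0]
shares = [10.0, 30.0, 50.0, 70.0]

slope, intercept = fit_simple_ols(anger_score, shares)
print(slope, intercept)
```

A positive slope in such a fit corresponds to the kind of association the abstract reports between emotional content and the number of shares; a real analysis would include multiple predictors and significance tests.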

