Twitter data
Recently Published Documents





2022 ◽  
Vol 18 (2) ◽  
pp. 1-27
Hang Cui ◽  
Tarek Abdelzaher

This article narrows the gap between physical sensing systems, which measure physical signals, and social sensing systems, which measure information signals, by (i) defining a novel algorithm for extracting information signals (building on results from text embedding) and (ii) showing that it increases the accuracy of truth discovery, that is, the separation of true information from false or manipulated information. The work is applied to separating true and false facts on social media, such as Twitter and Reddit, where users post predominantly short microblogs. The new algorithm decides how to aggregate the signal across the words in a microblog in order to cluster microblogs in the latent information signal space, where it is easier to separate true and false posts. Although previous literature has extensively studied the problem of short text embedding/representation, this article improves on previous work in three important respects: (1) Our work constitutes unsupervised truth discovery, requiring no labeled input or prior training. (2) We propose a new distance metric for efficient short text similarity estimation, which we call Semantic Subset Matching, that improves our ability to meaningfully cluster microblog posts in the latent information signal space. (3) We introduce an iterative framework that jointly improves microblog clustering and truth discovery. The evaluation shows that the approach improves the accuracy of truth discovery by 6.3%, 2.5%, and 3.8% (constituting a 38.9%, 14.2%, and 18.7% reduction in error, respectively) on three real Twitter data traces.

2022 ◽  
Vol 16 (1) ◽  
pp. 1-24
Marinos Poiitis ◽  
Athena Vakali ◽  
Nicolas Kourtellis

Aggression in online social networks has been studied mostly from the perspective of machine learning, which detects such behavior in a static context. However, the way aggression diffuses through the network has received little attention, as it poses modeling challenges. Modeling how aggression propagates from one user to another is an important research topic, since it can enable effective aggression monitoring, especially on media platforms, which up to now have applied simplistic user blocking techniques. In this article, we address aggression propagation modeling and minimization on Twitter, since it is a popular microblogging platform on which aggression has had several onsets. We propose various methods building on two well-known diffusion models, Independent Cascade (IC) and Linear Threshold (LT), to study how aggression evolves in the social network. We experimentally investigate how well each method can model aggression propagation using real Twitter data, while varying parameters such as seed user selection, graph edge weighting, and users’ activation timing. We find that the best-performing strategies select seed users with a degree-based approach, weight user edges based on the overlap of their social circles, and activate users according to their aggression levels. We further employ the best-performing models to predict which ordinary real users could become aggressive (and vice versa) in the future, and achieve up to AUC = 0.89 in this prediction task. Finally, we investigate aggression minimization by launching competitive cascades to “inform” and “heal” aggressors. We show that IC and LT models can be used in aggression minimization, providing less intrusive alternatives to the blocking techniques currently employed by Twitter.
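
For readers unfamiliar with the Independent Cascade model named in this abstract, the following is a minimal sketch of one simulation run in its generic textbook form. It is not the authors' implementation: the function name `independent_cascade`, the adjacency-list layout, and the edge probabilities are assumptions made for illustration, and the abstract's own edge weighting (social circle overlap) and activation rules are not reproduced.

```python
import random
from collections import deque


def independent_cascade(graph, seeds, rng=None):
    """Run one Independent Cascade simulation (generic form, illustrative only).

    graph: dict mapping node -> list of (neighbor, activation_probability).
    seeds: iterable of initially active nodes.
    Returns the set of nodes that are active when the cascade stops.
    """
    if rng is None:
        rng = random.Random()
    active = set(seeds)
    frontier = deque(active)
    while frontier:
        node = frontier.popleft()
        for neighbor, prob in graph.get(node, []):
            # A newly active node gets exactly one attempt on each
            # still-inactive neighbor, succeeding with probability `prob`.
            if neighbor not in active and rng.random() < prob:
                active.add(neighbor)
                frontier.append(neighbor)
    return active


# Hypothetical toy graph: edge probabilities are made up for the example.
toy_graph = {
    "a": [("b", 1.0)],
    "b": [("c", 1.0)],
    "c": [("d", 0.0)],
}
reached = independent_cascade(toy_graph, ["a"], random.Random(0))
```

Because a single run is random, influence estimates in practice usually average the size of the returned set over many runs with different seeds for the random generator.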

Neha Garg ◽  
Kamlesh Sharma

Sentiment analysis (SA) is an enduring area of research, especially in the field of text analysis. Text pre-processing is important for performing SA accurately. This paper presents a text processing model for SA that applies natural language processing techniques to Twitter data. The basic phases are text collection, text cleaning, pre-processing, feature extraction, and categorization of the data according to the SA techniques. Keeping the focus on Twitter data, the data is extracted in a domain-specific manner. In the data cleaning phase, noisy data, missing data, punctuation, tags, and emoticons are considered. For pre-processing, tokenization is performed, followed by stop word removal (SWR). The article provides insight into the techniques used for text pre-processing and their impact on the dataset. After text pre-processing, the accuracy of the classification techniques improved and dimensionality was reduced. The proposed corpus can be used in market analysis, customer behaviour analysis, polling analysis, and brand monitoring. The text pre-processing process can serve as a baseline for applying predictive analysis, machine learning, and deep learning algorithms, and can be extended according to the problem definition.
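
The cleaning, tokenization, and stop word removal steps described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the paper's model: the function name `preprocess`, the regular expressions, and the tiny `STOP_WORDS` set are assumptions, and a real pipeline would likely use a fuller stop word list and explicit emoticon handling.

```python
import re

# Hypothetical, deliberately tiny stop word list for the example.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for", "this", "it"}


def preprocess(tweet):
    """Clean a tweet, tokenize it, and remove stop words (illustrative only)."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[@#]\w+", " ", text)       # drop @mentions and #hashtags
    # Keep runs of letters and apostrophes only, which discards punctuation,
    # digits, and symbol-only emoticons such as ":)". Emoticons that contain
    # letters (for example ":D") would need an explicit rule.
    tokens = re.findall(r"[a-z']+", text)
    return [token for token in tokens if token not in STOP_WORDS]


tokens = preprocess("Loving the new phone! :) @shop http://t.co/x #happy")
# tokens == ["loving", "new", "phone"]
```

Removing stop words and non-word tokens shrinks the vocabulary, which is one way the dimensionality reduction mentioned in the abstract can come about.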

2022 ◽  
Amrita Mangaonkar ◽  
Rohit Pawar ◽  
Nahida Sultana Chowdhury ◽  
Rajeev R. Raje

Vaccines ◽  
2022 ◽  
Vol 10 (1) ◽  
pp. 103
Andrew T. Lian ◽  
Jingcheng Du ◽  
Lu Tang

Social media can be used to monitor the adverse effects of vaccines. The goal of this project is to develop a machine learning and natural language processing approach to identify COVID-19 vaccine adverse events (VAE) from Twitter data. Based on COVID-19 vaccine-related tweets (1 December 2020–1 August 2021), we built a machine learning-based pipeline to identify tweets containing personal experiences with COVID-19 vaccinations and to extract and normalize VAE-related entities, including dose(s); vaccine types (Pfizer, Moderna, and Johnson & Johnson); and symptom(s) from tweets. We further analyzed the extracted VAE data based on the location, time, and frequency. We found that the four most populous states (California, Texas, Florida, and New York) in the US witnessed the most VAE discussions on Twitter. The frequency of Twitter discussions of VAE coincided with the progress of the COVID-19 vaccinations. Sore to touch, fatigue, and headache are the three most common adverse effects of all three COVID-19 vaccines in the US. Our findings demonstrate the feasibility of using social media data to monitor VAEs. To the best of our knowledge, this is the first study to identify COVID-19 vaccine adverse event signals from social media. It can be an excellent supplement to the existing vaccine pharmacovigilance systems.
