Monitoring global trends in Covid-19 vaccination intention and confidence: a social media-based deep learning study

Abstract

Background: This study developed deep learning models to monitor global Covid-19 vaccination intention and confidence in real time.

Methods: We collected 6.73 million English tweets regarding Covid-19 vaccination globally from January 2020 to February 2021. Fine-tuned Transformer-based deep learning models were used to classify tweets in real time as they relate to Covid-19 vaccination intention and confidence. Temporal and spatial trend analyses were performed to map the global prevalence of Covid-19 vaccination intention and confidence, and public engagement on social media was analyzed.

Findings: Globally, the proportion of tweets indicating intent to accept Covid-19 vaccination declined from 64.49% in March to 39.54% in September 2020, and then began to recover, reaching 52.56% in early 2021. This recovery in vaccine acceptance was largely driven by the US and the European region, whereas other regions experienced declining trends in 2020. Intent to accept and confidence in Covid-19 vaccination were relatively high in the South-East Asia, Eastern Mediterranean, and Western Pacific regions, but low in the American, European, and African regions. In South Korea, 12.71% of tweets expressed misinformation or rumors; in the US, 14.04% expressed distrust in government; and in Greece, 16.16% described Covid-19 vaccines as unsafe. Each of these was the highest share globally for its category. Negative tweets, especially misinformation or rumors, drew more engagement than positive tweets from Twitter users with fewer followers.

Interpretation: This global real-time surveillance study highlights the importance of deep learning based social media monitoring to detect emerging trends of Covid-19 vaccination intention and confidence to inform timely interventions.

Funding: National Natural Science Foundation of China.

Research in context

Evidence before this study: With the Covid-19 vaccine rollout, each country should investigate vaccination intention in its local context to ensure mass vaccination.
We searched PubMed for all articles and preprints published until April 9, 2021 with the keywords “(“Covid-19 vaccines”[Mesh] OR Covid-19 vaccin*[TI]) AND (confidence[TI] OR hesitancy[TI] OR acceptance[TI] OR intention[TI])”. We identified more than 100 studies, most of which were country-level cross-sectional surveys; the largest global survey of Covid-19 vaccine acceptance to date covered only 32 countries. However, how Covid-19 vaccination intention changes over time remains unknown, and many countries have not yet been covered by previous surveys. A few studies assessed public sentiment towards Covid-19 vaccination using social media data, but targeted only limited geographical areas. Real-time surveillance is lacking, and no study to date has globally monitored Covid-19 vaccination intention in real time.

Added value of this study: To our knowledge, this is the largest global monitoring study of Covid-19 vaccination intention and confidence with social media data, covering over 100 countries from the beginning of the pandemic to February 2021. This study developed deep learning models by fine-tuning a Bidirectional Encoder Representations from Transformers (BERT)-based model with 8000 manually classified tweets, which can be used to monitor Covid-19 vaccination beliefs using social media data in real time. It provides temporal and spatial analyses of evolving beliefs about Covid-19 vaccines across the world, and offers insight into many countries not yet covered in previous surveys. This study highlights that the intention to accept Covid-19 vaccination has experienced a declining trend since the beginning of the pandemic in all world regions, with some regions recovering recently, though not to their original levels. This recovery was largely driven by the US and the European region (EUR), whereas other regions experienced declining trends in 2020.
Intention to accept and confidence in Covid-19 vaccination were relatively high in the South-East Asia region (SEAR), Eastern Mediterranean region (EMR), and Western Pacific region (WPR), but low in the American region (AMR), EUR, and the African region (AFR). Many AFR countries worried more about vaccine effectiveness, while EUR, AMR, and WPR were more concerned about vaccine safety (most of all in Greece, at 16.16%). Online misinformation or rumors were widespread in AMR, EUR, and South Korea (12.71%, ranking first globally), and distrust in government was more prevalent in AMR (14.04% in the US, ranking first globally). Our findings can serve as a reference point for future single-country survey data, and can inform timely and specific interventions for each country to address Covid-19 vaccine hesitancy.

Implications of all the available evidence: This global real-time surveillance study highlights the importance of deep learning based social media monitoring as a quick and effective method for detecting emerging trends of Covid-19 vaccination intention and confidence to inform timely interventions, especially in settings with limited resources and urgent timelines. Future research should build multilingual deep learning models and monitor Covid-19 vaccination intention and confidence in real time with data from multiple social media platforms.
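
The temporal trend described in the abstract (the share of tweets indicating intent to accept vaccination, tracked month by month) can be illustrated with a minimal Python sketch. It assumes the tweets have already been classified; the label names ("accept", "reject", "other"), the toy data, and the choice of all classified tweets in a month as the denominator are assumptions made for illustration and are not taken from the study.

```python
from collections import defaultdict


def monthly_acceptance_share(classified_tweets):
    """Return {month: percentage of that month's tweets labeled 'accept'}.

    classified_tweets is an iterable of (month, label) pairs, where the
    labels are hypothetical outputs of an upstream tweet classifier.
    """
    totals = defaultdict(int)
    accepts = defaultdict(int)
    for month, label in classified_tweets:
        totals[month] += 1
        if label == "accept":
            accepts[month] += 1
    # Percentage per month, reported in chronological order.
    return {m: round(100.0 * accepts[m] / totals[m], 2) for m in sorted(totals)}


# Toy data only; these are not figures from the study.
sample = [
    ("2020-03", "accept"), ("2020-03", "accept"), ("2020-03", "reject"),
    ("2020-09", "accept"), ("2020-09", "reject"),
    ("2020-09", "other"), ("2020-09", "reject"),
]
shares = monthly_acceptance_share(sample)
print(shares)
```

Grouping by country or WHO region instead of month would give the spatial counterpart of the same calculation.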


Information extraction from digital social trace data with applications to social media and scholarly communication data

ACM SIGIR Forum ◽ 10.1145/3451964.3451981 ◽ 2020 ◽ Vol 54 (1) ◽ pp. 1-2

Author(s): Shubhanshu Mishra

Keyword(s): Social Media ◽ Information Extraction ◽ Scholarly Communication ◽ Structured Data ◽ Graph Structure ◽ Learning Models ◽ Social Media Data ◽ Scholarly Data ◽ Media Data ◽ Machine Learning Models

Information extraction (IE) aims at extracting structured data from unstructured or semi-structured data. The thesis starts by identifying social media data and scholarly communication data as a special case of digital social trace data (DSTD). This identification allows us to utilize the graph structure of the data (e.g., user connected to a tweet, author connected to a paper, author connected to authors, etc.) for developing new information extraction tasks. The thesis focuses on information extraction from DSTD, first, using only the text data from tweets and scholarly paper abstracts, and then using the full graph structure of Twitter and scholarly communications datasets. This thesis makes three major contributions. First, new IE tasks based on DSTD representation of the data are introduced. For scholarly communication data, methods are developed to identify article and author level novelty [Mishra and Torvik, 2016] and expertise. Furthermore, interfaces for examining the extracted information are introduced. A social communication temporal graph (SCTG) is introduced for comparing different communication data like tweets tagged with sentiment, tweets about a search query, and Facebook group posts. For social media, new text classification categories are introduced, with the aim of identifying enthusiastic and supportive users, via their tweets. Additionally, the correlation between sentiment classes and Twitter meta-data in public corpora is analyzed, leading to the development of a better model for sentiment classification [Mishra and Diesner, 2018]. Second, methods are introduced for extracting information from social media and scholarly data. For scholarly data, a semi-automatic method is introduced for the construction of a large-scale taxonomy of computer science concepts. The method relies on the Wikipedia category tree. 
The constructed taxonomy is used for identifying key computer science phrases in scholarly papers, and tracking their evolution over time. Similarly, for social media data, machine learning models based on human-in-the-loop learning [Mishra et al., 2015], semi-supervised learning [Mishra and Diesner, 2016], and multi-task learning [Mishra, 2019] are introduced for identifying sentiment, named entities, part of speech tags, phrase chunks, and super-sense tags. The machine learning models are developed with a focus on leveraging all available data. The multi-task models presented here result in competitive performance against other methods, for most of the tasks, while reducing inference time computational costs. Finally, this thesis has resulted in the creation of multiple open source tools and public data sets (see URL below), which can be utilized by the research community. The thesis aims to act as a bridge between research questions and techniques used in DSTD from different domains. The methods and tools presented here can help advance work in the areas of social media and scholarly data analysis.
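
As a rough illustration of the graph view of DSTD mentioned above (user connected to a tweet, author connected to a paper, author connected to authors), the following Python sketch stores typed nodes and labeled edges and derives co-authors by walking author-paper edges. The class name, relation names, and identifiers are hypothetical and are not taken from the thesis or its tools.

```python
from collections import defaultdict


class TraceGraph:
    """Tiny typed graph: nodes are (type, id) tuples, edges carry a relation."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_edge(self, source, relation, target):
        self.edges[(source, relation)].add(target)

    def neighbors(self, source, relation):
        return sorted(self.edges.get((source, relation), set()))


def coauthors(graph, author):
    """Authors who share at least one paper with the given author."""
    found = set()
    for paper in graph.neighbors(author, "wrote"):
        found.update(graph.neighbors(paper, "written_by"))
    found.discard(author)
    return sorted(found)


graph = TraceGraph()
# Social media side: a user connected to a tweet.
graph.add_edge(("user", "u1"), "posted", ("tweet", "t1"))
# Scholarly side: authors connected to a paper, in both directions.
for author_id in ("a1", "a2"):
    graph.add_edge(("author", author_id), "wrote", ("paper", "p1"))
    graph.add_edge(("paper", "p1"), "written_by", ("author", author_id))

print(coauthors(graph, ("author", "a1")))
```

Keeping both data sources in one structure is what lets the same traversal code serve tweets and papers alike.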


Disaster management 2.0: A real-time disaster damage assessment model based on mobile social media data—A case study of Weibo (Chinese Twitter)

Safety Science ◽ 10.1016/j.ssci.2019.02.029 ◽ 2019 ◽ Vol 115 ◽ pp. 393-413 ◽ Cited By ~ 22

Author(s): Siqing Shan ◽ Feng Zhao ◽ Yigang Wei ◽ Mengni Liu

Keyword(s): Social Media ◽ Real Time ◽ Disaster Management ◽ Damage Assessment ◽ Assessment Model ◽ Social Media Data ◽ Model Based ◽ Media Data ◽ Damage Assessment Model


A Big Data Platform for Real Time Analysis of Signs of Depression in Social Media

International Journal of Environmental Research and Public Health ◽ 10.3390/ijerph17134752 ◽ 2020 ◽ Vol 17 (13) ◽ pp. 4752 ◽ Cited By ~ 1

Author(s): Rodrigo Martínez-Castaño ◽ Juan C. Pichel ◽ David E. Losada

Keyword(s): Social Media ◽ Real Time ◽ Public Health Surveillance ◽ Time Analysis ◽ Social Media Data ◽ Real Time Processing ◽ Processing Elements ◽ Real Time Analysis ◽ Data Platform ◽ Media Data

In this paper we propose a scalable platform for real-time processing of Social Media data. The platform ingests large volumes of content, such as Social Media posts or comments, and can support Public Health surveillance tasks. The processing and analytical needs of multiple screening tasks can easily be handled by incorporating user-defined execution graphs. The design is modular and supports different processing elements, such as crawlers to extract relevant content or classifiers to categorise Social Media. We describe here an implementation of a use case built on the platform that monitors Social Media users and detects early signs of depression.
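
The idea of a user-defined execution graph made of modular processing elements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names, the `run_graph` helper, and the keyword rule standing in for a classifier are all hypothetical, the sketch runs in a single batch rather than in real time, and it does not reflect the platform's actual API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_graph(elements, dependencies, source_items):
    """Run processing elements in dependency order.

    elements:     name -> function taking a list and returning a list
    dependencies: name -> set of upstream element names
    Elements with no upstream receive the raw source items.
    """
    outputs = {}
    for name in TopologicalSorter(dependencies).static_order():
        upstream = dependencies.get(name, set())
        if upstream:
            inputs = [item for up in sorted(upstream) for item in outputs[up]]
        else:
            inputs = list(source_items)
        outputs[name] = elements[name](inputs)
    return outputs


# Hypothetical elements: a "crawler" that drops empty posts, a placeholder
# keyword rule standing in for a real classifier, and a counter.
FLAG_WORDS = ("hopeless", "tired")
elements = {
    "crawler": lambda posts: [p for p in posts if p.strip()],
    "flagger": lambda posts: [p for p in posts
                              if any(w in p.lower() for w in FLAG_WORDS)],
    "counter": lambda posts: [len(posts)],
}
dependencies = {"crawler": set(), "flagger": {"crawler"}, "counter": {"flagger"}}

posts = ["Feeling hopeless today", "", "Great match last night",
         "so tired of everything"]
results = run_graph(elements, dependencies, posts)
print(results["counter"])
```

Because the graph is plain data, a different screening task can reuse the same elements by supplying a different `dependencies` mapping.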


Incremental Learning with Social Media Data to Predict Near Real-Time Events

Discovery Science - Lecture Notes in Computer Science ◽ 10.1007/978-3-319-11812-3_16 ◽ 2014 ◽ pp. 180-191

Author(s): Duc Kinh Le Tran ◽ Cécile Bothorel ◽ Pascal Cheung Mon Chan ◽ Yvon Kermarrec

Keyword(s): Social Media ◽ Real Time ◽ Incremental Learning ◽ Social Media Data ◽ Media Data


The impact of social media on human interaction in an organisation based on real-time social media data

International Journal of Data Science ◽ 10.1504/ijds.2019.102793 ◽ 2019 ◽ Vol 4 (3) ◽ pp. 260

Author(s): Sharifah Sakinah Syed Ahmad ◽ Anis Naseerah Binti Shaik Osman ◽ Halizah Basiron

Keyword(s): Social Media ◽ Real Time ◽ Human Interaction ◽ Social Media Data ◽ The Impact ◽ Media Data


A deep learning approach for detecting traffic accidents from social media data

Transportation Research Part C Emerging Technologies ◽ 10.1016/j.trc.2017.11.027 ◽ 2018 ◽ Vol 86 ◽ pp. 580-596 ◽ Cited By ~ 86

Author(s): Zhenhua Zhang ◽ Qing He ◽ Jing Gao ◽ Ming Ni

Keyword(s): Social Media ◽ Deep Learning ◽ Traffic Accidents ◽ Learning Approach ◽ Social Media Data ◽ Media Data


Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging

Journal Of Big Data ◽ 10.1186/s40537-021-00459-1 ◽ 2021 ◽ Vol 8 (1)

Author(s): Hans Christian ◽ Derwin Suhartono ◽ Andry Chowanda ◽ Kamal Z. Zamli

Keyword(s): Social Media ◽ Deep Learning ◽ Extraction Method ◽ Language Model ◽ Model Averaging ◽ Data Sources ◽ Online Information ◽ Social Media Data ◽ Personality Prediction ◽ Media Data

Abstract

The ever-increasing number of social media users has contributed to significant growth in the volume of online information. Often, the content that these users post on social media can give valuable insights into their personalities (e.g., for predicting job satisfaction, specific preferences, and the success of professional and romantic relationships), without the hassle of taking a formal personality test. Termed personality prediction, the process involves extracting features from digital content and mapping them to a personality model. Owing to its simplicity and proven capability, a well-known personality model, the big five personality traits, has often been adopted in the literature as the de facto standard for personality assessment. To date, many algorithms can extract contextualized word embeddings from textual data for personality prediction systems; some of them are based on ensemble models and deep learning. Although useful, existing algorithms such as RNN and LSTM suffer from the following limitations. First, they take a long time to train owing to their sequential inputs. Second, they lack the ability to capture the true (semantic) meaning of words, so the context is partly lost. To address these limitations, this paper introduces a new prediction approach using a multi-model deep learning architecture combined with multiple pre-trained language models, such as BERT, RoBERTa, and XLNet, as the feature extraction method on social media data sources. Finally, the system makes its prediction by model averaging.
Unlike earlier work, which adopts a single social media data source with open- and closed-vocabulary extraction methods, the proposed work uses multiple social media data sources, namely Facebook and Twitter, and produces a predictive model for each trait using bidirectional context features combined with the extraction method. Our experience with the proposed work has been encouraging, as it has outperformed similar existing works in the literature. More precisely, our results achieve a maximum accuracy of 86.2% and an F1 score of 0.912 on the Facebook dataset, and 88.5% accuracy and an F1 score of 0.882 on the Twitter dataset.
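
The model-averaging step mentioned in the abstract can be illustrated with a short Python sketch. It assumes each model outputs one probability per trait and that the averaged probability is thresholded at 0.5; both the averaging of probabilities (rather than, say, votes) and the threshold are assumptions, and the numbers are invented for illustration rather than taken from the paper.

```python
TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]


def average_predictions(model_probabilities, threshold=0.5):
    """Average per-trait probabilities across models, then threshold.

    model_probabilities: list of lists, one inner list per model,
    each holding one probability per trait in the same order.
    """
    n_models = len(model_probabilities)
    n_traits = len(model_probabilities[0])
    averaged = [
        sum(probs[i] for probs in model_probabilities) / n_models
        for i in range(n_traits)
    ]
    labels = [p >= threshold for p in averaged]
    return averaged, labels


# Hypothetical outputs from three feature-extractor-based models.
bert_probs = [0.8, 0.4, 0.6, 0.3, 0.7]
roberta_probs = [0.6, 0.5, 0.7, 0.2, 0.4]
xlnet_probs = [0.7, 0.3, 0.5, 0.4, 0.1]

averaged, labels = average_predictions([bert_probs, roberta_probs, xlnet_probs])
for trait, prob, label in zip(TRAITS, averaged, labels):
    print(f"{trait}: {prob:.2f} -> {label}")
```

Averaging smooths out disagreement between models: in the toy data, one model is confident about the last trait, but the ensemble average stays below the threshold.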
