Data-driven inferences of agency-level risk and response communication on COVID-19 through social media-based interactions

2021 ◽  
Vol 19 (7) ◽  
pp. 59-82
Author(s):  
Md Ashraf Ahmed, PhD Candidate ◽  
Arif Mohaimin Sadri, PhD ◽  
M. Hadi Amini, PhD, DEng

Risk perception and risk-averting behaviors of public agencies during the emergence and spread of COVID-19 can be retrieved through online social media (Twitter), and such interactions can be echoed in other information outlets. This study collected time-sensitive online social media data and used data-driven methods to analyze patterns of health risk communication by public health and emergency agencies during the emergence and spread of the novel coronavirus. The main focus is on understanding how policy-making agencies communicate risk and response information through social media during a pandemic and influence community response (ie, timing of lockdown, timing of reopening, etc.) and disease outbreak indicators (ie, number of confirmed cases and number of deaths). Twitter data from six major public organizations (1,000-4,500 tweets per organization) were collected from February 21, 2020 to June 6, 2020. Several machine learning algorithms, including dynamic topic modeling and sentiment analysis, were applied over time to identify topic dynamics across the specific timeline of the pandemic. At different points in the timeline, organizations emphasized various topics, eg, the importance of wearing face masks, home quarantine, understanding the symptoms, social distancing and contact tracing, emerging community transmission, lack of personal protective equipment, COVID-19 testing and medical supplies, the effect of tobacco, pandemic stress management, increasing hospitalization rate, the upcoming hurricane season, use of convalescent plasma for COVID-19 treatment, maintaining hygiene, and the role of healthcare podcasts. The findings can help emergency management, policymakers, and public health agencies identify targeted information dissemination policies for a public with diverse needs, based on how local, federal, and international agencies reacted to COVID-19.

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sakthi Kumar Arul Prakash ◽  
Conrad Tucker

This work investigates the ability to classify misinformation in online social media networks in a manner that avoids the need for ground truth labels. Rather than approach the classification problem as a task for humans or machine learning algorithms, this work leverages user–user and user–media (i.e., media likes) interactions to infer the type of information (fake vs. authentic) being spread, without needing to know the actual details of the information itself. To study the inception and evolution of user–user and user–media interactions over time, we create an experimental platform that mimics the functionality of real-world social media networks. We develop a graphical model that considers the evolution of this network topology to model the uncertainty (entropy) propagation when fake and authentic media disseminate across the network. The creation of a real-world social media network enables a wide range of hypotheses to be tested pertaining to users, their interactions with other users, and with media content. The discovery that the entropy of user–user and user–media interactions approximates fake and authentic media likes enables us to classify fake media in an unsupervised manner.
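The abstract uses entropy as its measure of uncertainty without giving the graphical model itself. As a reminder of the underlying quantity only, the following minimal sketch computes the Shannon entropy of a distribution of likes. The user groups and counts are made up for illustration, and nothing here reproduces the authors' model.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return sum(-p * math.log2(p) for p in probs)

# Hypothetical counts of likes on one media item, split across four
# made-up user groups. These numbers are illustrative, not from the paper.
likes_concentrated = {"group_a": 8, "group_b": 0, "group_c": 0, "group_d": 0}
likes_spread = {"group_a": 2, "group_b": 2, "group_c": 2, "group_d": 2}

h_concentrated = shannon_entropy(likes_concentrated.values())
h_spread = shannon_entropy(likes_spread.values())
```

Likes concentrated in a single group carry no uncertainty, while likes spread evenly across the groups give the maximum entropy for four outcomes.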


Sexual Health ◽  
2015 ◽  
Vol 12 (2) ◽  
pp. 170 ◽  
Author(s):  
Edward Coughlan ◽  
Heather Young ◽  
Catherine Parkes ◽  
Maureen Coshall ◽  
Nigel Dickson ◽  
...  

During 2012, Christchurch experienced a dramatic increase in cases of infectious syphilis among men who have sex with men. This was accompanied by some novel trends; notably, the acquisition of infection in a younger age group, with local sexual contacts, commonly via the use of social media. This study is a report on an approach to case identification and public health communication as a component of a multifaceted outbreak response. Enhanced syphilis surveillance data and data on public health responses to outbreaks of sexually transmissible infections were collated and reviewed, alongside clinical records and literature. Reported outbreak response methods were adapted for the Christchurch cohort. A Facebook page was created to raise awareness of infectious syphilis, the importance of screening and where to get tested. Twenty-six males were diagnosed with infectious syphilis in 2012, an increase from previous years, of whom 22 reported only male sexual contact. Frequent use of social media to find potential sexual contacts was reported. Enhanced syphilis surveillance characterised in detail an infectious syphilis outbreak in Christchurch. Index cases were identified, contact tracing mapping was used to identify transmission networks and social media was also used to educate the risk group. There was a decrease in infectious syphilis presentations, with no cases in the last 3 months of 2012.


2020 ◽  
Vol 17 (7) ◽  
pp. 2869-2875
Author(s):  
Sajay Thomas Samuel ◽  
Booma Poolan Marikannan

Machine learning can help people perform complex tasks and solve problems because it learns patterns from historical data and makes predictions based on them. This research addresses movie reviews on social media, specifically Twitter: it gathers tweets that review movies and displays a rating based on the sentiment of the tweets. Twitter is an online social media website where people from all walks of life communicate by tweeting short updates within a limit of 280 characters. Twitter continues to grow as a business and has become one of the biggest platforms for communication and instant messaging. Because of its large number of users, voluminous amounts of data are available that can be analysed for more in-depth information and insights, including the sentiment of tweets. Today, many applications use sentiment analysis in various fields, for example to gain insights about a particular brand or product. Performing sentiment analysis in traditional ways can be time consuming and very complex. The aim of this research is to investigate the domain of sentiment analysis and incorporate a machine learning algorithm into a system that can obtain and display the ratings of a particular movie. The machine learning algorithms used are the Naïve Bayes classifier and SVM. The algorithm with the better accuracy will be chosen for the implementation phase.
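To make the Naïve Bayes step concrete, here is a minimal sketch of a multinomial Naïve Bayes sentiment classifier with add-one smoothing, written with the standard library only. The tweets, labels, and function names are made up for illustration and are not the authors' data or implementation. The SVM comparison is not shown.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log posterior for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # skip words never seen in training
            count = word_counts[label][word]
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, docs):
    """Fraction of (text, label) pairs the model labels correctly."""
    correct = sum(predict(model, text) == label for text, label in docs)
    return correct / len(docs)

# Made-up tweets, for illustration only.
train = [
    ("loved this movie great acting", "pos"),
    ("great story and wonderful cast", "pos"),
    ("boring plot and terrible acting", "neg"),
    ("awful movie waste of time", "neg"),
]
test = [
    ("wonderful movie great cast", "pos"),
    ("terrible boring waste", "neg"),
]

model = train_naive_bayes(train)
test_accuracy = accuracy(model, test)
```

Under the selection rule described in the abstract, the same `accuracy` function would be applied to each candidate model on held-out tweets, and the higher-scoring model kept.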


2020 ◽  
Author(s):  
Robert Chew ◽  
Caroline Kery ◽  
Laura Baum ◽  
Thomas Bukowski ◽  
Annice Kim ◽  
...  

BACKGROUND Social media are important for monitoring perceptions of public health issues and for educating target audiences about health; however, limited information about the demographics of social media users makes it challenging to identify conversations among target audiences and limits how well social media can be used for public health surveillance and education outreach efforts. Certain social media platforms provide demographic information on the followers of a user account where users have supplied it, but such details are not always disclosed, and researchers have developed machine learning algorithms to predict social media users’ demographic characteristics, mainly for Twitter. To date, there has been limited research on predicting the demographic characteristics of Reddit users. OBJECTIVE We aimed to develop a machine learning algorithm that predicts the age segment of Reddit users, as either adolescents or adults, based on publicly available data. METHODS This study was conducted between January and September 2020 using publicly available Reddit posts as input data. We manually labeled Reddit users’ age by identifying and reviewing public posts in which Reddit users self-reported their age. We then collected sample posts, comments, and metadata for the labeled user accounts and created variables to capture linguistic patterns, posting behavior, and account details that would distinguish the adolescent age group (aged 13 to 20 years) from the adult age group (aged 21 to 54 years). We split the data into training (n=1660) and test sets (n=415) and performed 5-fold cross validation on the training set to select hyperparameters and perform feature selection. We ran multiple classification algorithms and tested the performance of the models (precision, recall, F1 score) in predicting the age segments of the users in the labeled data. 
To evaluate associations between each feature and the outcome, we calculated means and confidence intervals and compared the two age groups, with 2-sample t tests, for each transformed model feature. RESULTS The gradient boosted trees classifier performed the best, with an F1 score of 0.78. The test set precision and recall scores were 0.79 and 0.89, respectively, for the adolescent group (n=254) and 0.78 and 0.63, respectively, for the adult group (n=161). The most important feature in the model was the number of sentences per comment (permutation score: mean 0.100, SD 0.004). Members of the adolescent age group tended to have created accounts more recently, have higher proportions of submissions and comments in the r/teenagers subreddit, and post more in subreddits with higher subscriber counts than those in the adult group. CONCLUSIONS We created a Reddit age prediction algorithm with competitive accuracy using publicly available data, suggesting machine learning methods can help public health agencies identify age-related target audiences on Reddit. Our results also suggest that there are characteristics of Reddit users’ posting behavior, linguistic patterns, and account features that distinguish adolescents from adults.
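The reported precision and recall values can be related to the F1 score with a short calculation. The sketch below computes a per-group F1 from the figures in the abstract and then a support-weighted average. The abstract does not say how its overall F1 of 0.78 was averaged, so the weighting used here is an assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Test-set precision, recall, and group size as reported in the abstract.
reported = {
    "adolescent": {"precision": 0.79, "recall": 0.89, "n": 254},
    "adult": {"precision": 0.78, "recall": 0.63, "n": 161},
}

per_class_f1 = {
    group: f1_score(m["precision"], m["recall"])
    for group, m in reported.items()
}

total = sum(m["n"] for m in reported.values())

# Support-weighted average. Assumption: the abstract does not state which
# averaging produced its overall F1 score.
weighted_f1 = sum(per_class_f1[g] * reported[g]["n"] for g in reported) / total
```

With these rounded inputs the per-group values come out near 0.84 for adolescents and 0.70 for adults, and the support-weighted average rounds to 0.78.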


Entropy ◽  
2021 ◽  
Vol 23 (12) ◽  
pp. 1642
Author(s):  
James Flamino ◽  
Bowen Gong ◽  
Frederick Buchanan ◽  
Boleslaw K. Szymanski

Online social media provides massive open-ended platforms for users of a wide variety of backgrounds, interests, and beliefs to interact and debate, facilitating countless discussions across a myriad of subjects. With numerous unique voices being lent to the ever-growing information stream, it is essential to consider how the types of conversations that result from a social media post represent the post itself. We hypothesize that the biases and predispositions of users cause them to react to different topics in different ways not necessarily entirely intended by the sender. In this paper, we introduce a set of unique features that capture patterns of discourse, allowing us to empirically explore the relationship between a topic and the conversations it induces. Utilizing “microscopic” trends to describe “macroscopic” phenomena, we set a paradigm for analyzing information dissemination through the user reactions that arise from a topic, eliminating the need to analyze the involved text of the discussions. Using a Reddit dataset, we find that our features not only enable classifiers to accurately distinguish between content genres, but can also identify more subtle semantic differences in content under a single topic and isolate outliers whose subject matter is substantially different from the norm.


2018 ◽  
Author(s):  
Sachin Muralidhara ◽  
Michael J. Paul

BACKGROUND Social media provides a complementary source of information for public health surveillance. The dominant data source for this type of monitoring is the microblogging platform Twitter, which is convenient due to the free availability of public data. Less is known about the utility of other social media platforms, despite their popularity. OBJECTIVE This work aims to characterize the health topics that are prominently discussed on the image-sharing platform Instagram, as a step toward understanding how these data might be used for public health research. METHODS The study uses a topic modeling approach to discover topics in a dataset of 96,426 Instagram posts containing hashtags related to health. We use a polylingual topic model, initially developed for datasets in different natural languages, to model different modalities of data: hashtags, caption words, and image tags automatically extracted using a computer vision tool. RESULTS We identified 47 health-related topics in the data (kappa=.77), covering ten broad categories: acute illness, alternative medicine, chronic illness and pain, diet, exercise, health care & medicine, mental health, musculoskeletal health and dermatology, sleep, and substance use. The most prevalent topics were related to diet (8,293/96,426; 8.6% of posts) and exercise (7,328/96,426; 7.6% of posts). CONCLUSIONS A large and diverse set of health topics is discussed on Instagram. The extracted image tags were generally too coarse and noisy to be used for identifying posts but were in some cases accurate for identifying images relevant to studying diet and substance use. Instagram shows potential as a source of public health information, though limitations in data collection and metadata availability may limit its use in comparison to platforms like Twitter.


2015 ◽  
Vol 2 (2) ◽  
pp. e17 ◽  
Author(s):  
Li Guan ◽  
Bibo Hao ◽  
Qijin Cheng ◽  
Paul SF Yip ◽  
Tingshao Zhu

Background Traditional offline assessment of suicide probability is time consuming, and it is difficult to convince at-risk individuals to participate. Identifying individuals with high suicide probability through online social media has an advantage in its efficiency and potential to reach out to hidden individuals, yet little research has been focused on this specific field. Objective The objective of this study was to apply two classification models, Simple Logistic Regression (SLR) and Random Forest (RF), to examine the feasibility and effectiveness of identifying microblog users in China with high suicide probability through profile and linguistic features extracted from Internet-based data. Methods A total of 909 Chinese microblog users completed an Internet survey. Those scoring one SD above the mean of the total Suicide Probability Scale (SPS) score in the participant sample were labeled as high-risk individuals in general suicide probability, and those scoring one SD above the mean in each of the four subscale scores were labeled as high-risk individuals in the corresponding dimension. Profile and linguistic features were fed into two machine learning algorithms (SLR and RF) to train models that aim to identify high-risk individuals in general suicide probability and in its four dimensions. Models were trained and then tested by 5-fold cross validation, in which both the training set and the test set were generated under the stratified random sampling rule from the whole sample. Three classic performance metrics (Precision, Recall, F1 measure) and a specifically defined metric, “Screening Efficiency”, were adopted to evaluate model effectiveness. Results Classification performance was generally matched between SLR and RF. Given the best performance of the classification models, we were able to retrieve over 70% of the labeled high-risk individuals in overall suicide probability as well as in the four dimensions. Screening Efficiency of most models varied from 1/4 to 1/2. Precision of the models was generally below 30%. 
Conclusions Individuals in China with high suicide probability are recognizable by profile and text-based information from microblogs. Although there is still much room to improve the performance of classification models in the future, this study may shed light on preliminary screening of at-risk individuals via machine learning algorithms, which can work side-by-side with expert scrutiny to increase efficiency in large-scale surveillance of suicide probability from online social media.
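The labeling rule in the Methods (scores one SD above the sample mean are treated as high risk) can be sketched in a few lines. The scores below are made up for illustration and are not study data. The use of the sample SD and of a strict "greater than" comparison are both assumptions, since the abstract specifies neither.

```python
import statistics

def label_high_risk(scores):
    """Flag scores more than one SD above the sample mean."""
    mean = statistics.mean(scores)
    # Assumption: sample SD; the abstract does not say which SD was used.
    sd = statistics.stdev(scores)
    threshold = mean + sd
    # Assumption: strict comparison; ">=" is an equally plausible reading.
    return [score > threshold for score in scores], threshold

# Made-up questionnaire totals, for illustration only.
scores = [50, 52, 55, 58, 60, 61, 63, 65, 70, 96]
flags, threshold = label_high_risk(scores)
high_risk_count = sum(flags)
```

In the study this rule would be applied once to the total SPS score and once to each of the four subscale scores, giving five separate sets of labels.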


2019 ◽  
Author(s):  
Joana M Barros ◽  
Jim Duggan ◽  
Dietrich Rebholz-Schuhmann

BACKGROUND Public health surveillance is based on the continuous and systematic collection, analysis, and interpretation of data. This informs the development of early warning systems to monitor epidemics and documents the impact of intervention measures. The introduction of digital data sources, and specifically sources available on the internet, has impacted the field of public health surveillance. New opportunities enabled by the underlying availability and scale of internet-based sources (IBSs) have paved the way for novel approaches for disease surveillance, exploration of health communities, and the study of epidemic dynamics. This field and approach is also known as infodemiology or infoveillance. OBJECTIVE This review aimed to assess research findings regarding the application of IBSs for public health surveillance (infodemiology or infoveillance). To achieve this, we have presented a comprehensive systematic literature review with a focus on these sources and their limitations, the diseases targeted, and commonly applied methods. METHODS A systematic literature review was conducted targeting publications between 2012 and 2018 that leveraged IBSs for public health surveillance, outbreak forecasting, disease characterization, diagnosis prediction, content analysis, and health-topic identification. The search results were filtered according to previously defined inclusion and exclusion criteria. RESULTS Spanning a total of 162 publications, we determined infectious diseases to be the preferred case study (108/162, 66.7%). Of the eight categories of IBSs (search queries, social media, news, discussion forums, websites, web encyclopedia, and online obituaries), search queries and social media were applied in 95.1% (154/162) of the reviewed publications. We also identified limitations in representativeness and biased user age groups, as well as high susceptibility to media events by search queries, social media, and web encyclopedias. 
CONCLUSIONS IBSs are a valuable proxy to study illnesses affecting the general population; however, it is important to characterize which diseases are best suited for the available sources; the literature shows that the level of engagement among online platforms can be a potential indicator. It is necessary to understand the population’s online behavior; in addition, health information dissemination and its content remain largely unexplored. With this information, we can understand how the population communicates about illnesses online and, in the process, benefit public health.


2020 ◽  
Vol 26 (2) ◽  
pp. 53-71
Author(s):  
Lidwina Mutia Sadasri

Information dissemination in the media, specifically social media, is one of the critical channels of information related to the COVID-19 outbreak sought by the public. The information presented has included both accurate and reliable situation reports and false information in various forms, not only text-based but also audio and visual. The chaos of data, coupled with a central response that seemed unprepared, shaped the Indonesian community’s perceptions of the COVID-19 outbreak. This, together with the massive number of internet users in Indonesia, is one aspect of the decision by the government, in this case BNPB (Badan Nasional Penanggulangan Bencana; officially National Disaster Management Authority), to engage strong social media influencers. The government collaborated with some influencers to enable public engagement through online social media platforms in the context of COVID-19, two of them being @dr.tirta and @rachelvennya. The platforms also gained more visibility after being appointed COVID-19 influencers. They updated information about COVID-19 on their social media accounts with picture posts and Instagram stories, either individually or in collaboration with others. This study aims to analyse the practice of the Indonesian government’s agency using micro-celebrity to deploy a risk communication frame and the delivery of the message by a celebrated person.


Author(s):  
Jean B. Nachega ◽  
Rhoda Atteh ◽  
Chikwe Ihekweazu ◽  
Nadia A. Sam-Agudu ◽  
Prisca Adejumo ◽  
...  

Most African countries have recorded relatively lower COVID-19 burdens than Western countries. This has been attributed to early and strong political commitment and robust implementation of public health measures, such as nationwide lockdowns, travel restrictions, face mask wearing, testing, contact tracing, and isolation, along with community education and engagement. Other factors include the younger population age strata and hypothesized but yet-to-be confirmed partially protective cross-immunity from parasitic diseases and/or other circulating coronaviruses. However, the true burden may also be underestimated due to operational and resource issues for COVID-19 case identification and reporting. In this perspective article, we discuss selected best practices and challenges with COVID-19 contact tracing in Nigeria, Rwanda, South Africa, and Uganda. Best practices from these country case studies include sustained, multi-platform public communications; leveraging of technology innovations; applied public health expertise; deployment of community health workers; and robust community engagement. Challenges include an overwhelming workload of contact tracing and case detection for healthcare workers, misinformation and stigma, and poorly sustained adherence to isolation and quarantine. Important lessons learned include the need for decentralization of contact tracing to the lowest geographic levels of surveillance, rigorous use of data and technology to improve decision-making, and sustainment of both community sensitization and political commitment. Further research is needed to understand the role and importance of contact tracing in controlling community transmission dynamics in African countries, including among children. Also, implementation science will be critically needed to evaluate innovative, accessible, and cost-effective digital solutions to accommodate the contact tracing workload.

