Scraping social media data for disaster communication: how the pattern of Twitter users affects disasters in Asia and the Pacific

2020 ◽  
Vol 103 (3) ◽  
pp. 3415-3435
Author(s):  
Bevaola Kusumasari ◽  
Nias Phydra Aji Prabowo
Author(s):  
Juan M. Banda ◽  
Gurdas Viguruji Singh ◽  
Osaid Alser ◽  
Daniel Prieto-Alhambra

As the COVID-19 virus continues to infect people across the globe, there is little understanding of the long-term implications for recovered patients. There have been reports of persistent symptoms after confirmed infection, even three months after initial recovery. While some of these patients have documented follow-ups in clinical records or participate in longitudinal surveys, these datasets are usually not publicly available or standardized enough to support longitudinal analyses. Therefore, there is a need to use additional data sources for continued follow-up and identification of latent symptoms that might be underreported elsewhere. In this work we present a preliminary characterization of post-COVID-19 symptoms using social media data from Twitter. We use a combination of natural language processing and clinician reviews to identify long-term self-reported symptoms in a set of Twitter users.


2021 ◽  
Author(s):  
Michael Caballero

One major sub-domain of polling public opinion with social media data is electoral prediction. Electoral prediction using social media data could significantly affect campaign strategies, complementing traditional polling methods and providing cheaper polling in real time. First, this paper explores past successful methods from research for analysis and prediction of the 2020 US Presidential Election using Twitter data. Then, this research proposes a new method for electoral prediction which combines sentiment, from NLP on the text of tweets, and structural data with aggregate polling, a time series analysis, and a special focus on Twitter users critical to the election. Though this method performed worse than its baseline of polling predictions, it is inconclusive whether this is an accurate method for predicting elections due to scarcity of data. More research and more data are needed to accurately measure this method’s overall effectiveness.


10.2196/17087 ◽  
2020 ◽  
Vol 22 (7) ◽  
pp. e17087
Author(s):  
Yulin Hswen ◽  
Amanda Zhang ◽  
Kara C Sewalk ◽  
Gaurav Tuli ◽  
John S Brownstein ◽  
...  

Background Discrimination in the health care system contributes to worse health outcomes among lesbian, gay, bisexual, transgender, and queer (LGBTQ) patients. Objective The aim of this study is to examine disparities in patient experience among LGBTQ persons using social media data. Methods We collected patient experience data from Twitter from February 2013 to February 2017 in the United States. We compared the sentiment of patient experience tweets between Twitter users who self-identified as LGBTQ and non-LGBTQ. The effect of state-level partisan identity on patient experience sentiment and differences between LGBTQ users and non-LGBTQ users were analyzed. Results We observed lower (more negative) patient experience sentiment among 13,689 LGBTQ users compared to 1,362,395 non-LGBTQ users. Increasing state-level liberal political identification was associated with higher patient experience sentiment among all users but had stronger effects for LGBTQ users. Conclusions Our findings highlight that social media data can yield insights about patient experience for LGBTQ persons and suggest that a state-level sociopolitical environment influences patient experience for this group. Efforts are needed to reduce disparities in patient care for LGBTQ persons while taking into context the effect of the political climate on these inequities.


2015 ◽  
Vol 5 (2) ◽  
pp. 90
Author(s):  
Mete Celik ◽  
Ahmet Sakir Dokuz

The massive number of data-related applications and the widespread use of web technologies have started the big data era. Social media data is one of the big data sources. Mining social media data provides useful insights for companies and organizations for developing their services, products or organizations. This study aims to analyze Turkish Twitter users based on their daily and hourly social media posts. In this way, the daily and hourly mood patterns of Turkish social media users can be revealed as positive or negative. For this purpose, the Support Vector Machines (SVM) classification algorithm and the Term Frequency – Inverse Document Frequency (TF-IDF) feature selection technique were used. To the best of our knowledge, this is the first attempt to analyze all of people’s posts on social media and generate results for temporal indicators at macro and micro levels.

Keywords: big data, social media, text classification, svm, tf-idf term weighting, daily and hourly mood patterns.
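The TF-IDF weighting named in the abstract above can be illustrated with a minimal sketch. The abstract does not say which TF-IDF variant was used, so the formula here (relative term frequency times the natural log of N/df) is an assumption, and the three toy documents are hypothetical English stand-ins for tokenized tweets.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus standing in for tokenized tweets.
docs = [
    "good morning happy day".split(),
    "bad morning sad day".split(),
    "happy happy evening".split(),
]
weights = tf_idf(docs)
```

Terms that occur in fewer documents (such as "good") receive a higher weight than terms shared across documents (such as "morning"), which is what makes the weights useful as classifier features.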


2019 ◽  
Author(s):  
Yulin Hswen ◽  
Amanda Zhang ◽  
Kara Sewalk ◽  
Gaurav Tuli ◽  
John S Brownstein ◽  
...  

BACKGROUND Discrimination in the healthcare system contributes to worse health outcomes among lesbian, gay, bisexual, transgender and queer (LGBTQ) patients. OBJECTIVE To examine disparities in patient experience among LGBTQ persons using social media data. METHODS We collected patient experience data from Twitter from February 2013 to February 2017 in the United States. We compared sentiment of patient experience tweets between Twitter users who self-identified as LGBTQ and non-LGBTQ. The effect of state-level partisan identity on patient experience sentiment and differences between LGBTQ users and non-LGBTQ users were analyzed. RESULTS We observed lower patient experience sentiment among 13,689 LGBTQ users compared to 1,362,395 non-LGBTQ users. Increasing state-level liberal political identification was associated with higher patient experience sentiment among all users but had stronger effects for LGBTQ users. CONCLUSIONS Our findings highlight that social media data can yield insights about patient experience for LGBTQ persons and suggest that state-level socio-political environment influences patient experience for this group. Efforts are needed to reduce disparities in patient care for LGBTQ persons while taking into context the effect of political climate on these inequities.


2019 ◽  
Vol 11 (23) ◽  
pp. 6748 ◽  
Author(s):  
Minxuan Lan ◽  
Lin Liu ◽  
Andres Hernandez ◽  
Weiyi Liu ◽  
Hanlin Zhou ◽  
...  

As a measurement of the residential population, the Census population ignores the mobility of the people. This weakness may be alleviated by the use of ambient population, derived from social media data such as tweets. This research aims to examine the degree to which geotagged tweets, in contrast to the Census population, can explain crime. In addition, the mobility of Twitter users suggests that tweets as the ambient population may have a spillover effect on the neighboring areas. Based on a yearlong geotagged tweet dataset, negative binomial regression models are used to test the impact of the tweet-derived ambient population, as well as its possible spillover effect, on theft crimes. Results show: (1) tweet count is a viable replacement for the Census population in spatial theft pattern analysis; (2) tweet count as a measure of the ambient population shows a significant spillover effect on thefts, while no such spillover effect exists for the Census population; (3) the combination of tweet count and its spatial lag outperforms the Census population in theft crime analyses. Therefore, the spillover effect of the tweet-derived ambient population should be considered in future crime analyses. This finding may be applicable to other social media data as well.
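The spatial lag of tweet counts mentioned in the abstract above can be sketched as the average count over each area's neighbors. The abstract does not state how the spatial weights were defined, so the row-standardized neighbor list below is an assumption, and the area names and counts are hypothetical.

```python
def spatial_lag(values, neighbors):
    """Average each unit's neighbor values (row-standardized weights).

    values: {unit: count}; neighbors: {unit: [adjacent units]}.
    Units without neighbors get a lag of 0.0.
    """
    lag = {}
    for unit, adjacent in neighbors.items():
        if adjacent:
            lag[unit] = sum(values[n] for n in adjacent) / len(adjacent)
        else:
            lag[unit] = 0.0
    return lag


# Hypothetical areas arranged in a line: A - B - C.
tweet_counts = {"A": 120, "B": 30, "C": 60}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
lagged = spatial_lag(tweet_counts, neighbors)
```

In a regression such as the negative binomial models described above, the lagged value would enter as an additional predictor alongside each area's own tweet count; fitting that model is outside this sketch.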


JAMIA Open ◽  
2021 ◽  
Vol 4 (2) ◽  
Author(s):  
Yuan-Chi Yang ◽  
Mohammed Ali Al-Garadi ◽  
Jennifer S Love ◽  
Jeanmarie Perrone ◽  
Abeed Sarker

Abstract Objective Biomedical research involving social media data is gradually moving from population-level to targeted, cohort-level data analysis. Though crucial for biomedical studies, social media users’ demographic information (eg, gender) is often not explicitly known from profiles. Here, we present an automatic gender classification system for social media and we illustrate how gender information can be incorporated into a social media-based health-related study. Materials and Methods We used a large Twitter dataset composed of public, gender-labeled users (Dataset-1) for training and evaluating the gender detection pipeline. We experimented with machine learning algorithms including support vector machines (SVMs) and deep-learning models, and public packages including M3. We considered users’ information including profile and tweets for classification. We also developed a meta-classifier ensemble that strategically uses the predicted scores from the classifiers. We then applied the best-performing pipeline to Twitter users who have self-reported nonmedical use of prescription medications (Dataset-2) to assess the system’s utility. Results and Discussion We collected 67 181 and 176 683 users for Dataset-1 and Dataset-2, respectively. A meta-classifier involving SVM and M3 performed the best (Dataset-1 accuracy: 94.4% [95% confidence interval: 94.0–94.8%]; Dataset-2: 94.4% [95% confidence interval: 92.0–96.6%]). Including automatically classified information in the analyses of Dataset-2 revealed gender-specific trends: proportions of females closely resemble data from the National Survey of Drug Use and Health 2018 (tranquilizers: 0.50 vs 0.50; stimulants: 0.50 vs 0.45), and the overdose Emergency Room Visit due to Opioids by Nationwide Emergency Department Sample (pain relievers: 0.38 vs 0.37).
Conclusion Our publicly available, automated gender detection pipeline may aid cohort-specific social media data analyses (https://bitbucket.org/sarkerlab/gender-detection-for-public).
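One simple way a meta-classifier can combine predicted scores from two classifiers is a weighted average followed by a threshold. This is only an illustrative sketch: the abstract above does not describe how the authors' ensemble uses the scores, and the function names, weights, and score values below are all hypothetical.

```python
def ensemble_scores(scores_a, scores_b, weight_a=0.5):
    """Blend two classifiers' probability scores with a weighted average."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    return [weight_a * a + (1 - weight_a) * b
            for a, b in zip(scores_a, scores_b)]


def predict(scores, threshold=0.5):
    """Turn blended scores into boolean labels."""
    return [score >= threshold for score in scores]


# Hypothetical per-user scores from two base classifiers.
svm_scores = [0.9, 0.4, 0.2]
m3_scores = [0.7, 0.8, 0.1]
blended = ensemble_scores(svm_scores, m3_scores)
labels = predict(blended)
```

The second user shows why blending can help: one classifier scores below the threshold and the other above it, and the averaged score decides the label.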


Author(s):  
Milad Mirbabaie ◽  
Christian Ehnis ◽  
Stefan Stieglitz ◽  
Deborah Bunker ◽  
Tanja Rose

Abstract Social media has become an important channel of communication in emergency and disaster management. Emergency Management Agencies can distribute helpful and important information to the general public and also gather information to enrich their management efforts. This, however, remains challenging since several communication-related barriers occur. This study investigates how the concept of Nudging, a form of behaviour adjustment, can be applied to address these barriers. A Systematic Literature Review and qualitative social media data analysis methods were applied to explore the potential of digital nudges on social media. Twelve forms of digital nudges could be identified in the data that influenced the visibility of the messages they occurred in. The results suggest that Digital Nudging on Social Media is a promising approach to use in emergency and disaster communication.


2021 ◽  
Author(s):  
Maya Stemmer ◽  
Yisrael Parmet ◽  
Gilad Ravid

BACKGROUND Social media serve as an alternate information source for patients, who use them to share information and provide social support. Though large amounts of health-related data are being posted on Twitter and other social networking platforms each day, research using social media data for understanding chronic conditions and patients' lifestyles is still lacking. OBJECTIVE In this research we contribute to closing this gap by providing a framework for identifying patients with Inflammatory Bowel Disease (IBD) on Twitter and learning from their personal experience. We enable the analysis of patients' tweets by building a classifier of Twitter users that distinguishes patients from other entities. The research aims to assess the feasibility of using social media data to promote chronically ill patients' wellbeing, by relying on the wisdom of the crowd for identifying healthy lifestyles. We seek to leverage posts describing patients' daily activities and the influence on their wellbeing for characterizing different treatments and understanding what works for whom. METHODS In the first stage of the research, a machine learning method combining both social network analysis and natural language processing was used to automatically classify users as patients or not. Three types of features were considered: (1) the user's behavior on Twitter, (2) the content of the user's tweets, and (3) the social structure of the user's network. Different classification algorithms were examined and compared using two measures (F1-score and precision) over 10-fold cross-validation. In the second stage of the research, the obtained classification methods were used to collect tweets of patients, in which they refer to the different lifestyle changes they endure in order to deal with their disease. Using IBM Watson Service for entity sentiment analysis, we calculated the average sentiment of 420 lifestyle-related words that IBD patients use when describing their daily routine.
RESULTS The best classification results (F1-score 0.808 and precision 0.809) for identifying IBD patients among Twitter users were achieved by a multiple-instance learning approach, which constitutes the novelty of this research. The sentiment analysis of tweets written by IBD patients identified frequently mentioned lifestyles and their influence on patients' wellbeing. The findings reinforced what is known about suitable nutrition for IBD, and several foods that are known to cause inflammation were highlighted as words with negative sentiment. CONCLUSIONS Patients everywhere use social media to share health and treatment information, learn from each other's experiences, and provide social support. Mining these informative conversations may shed some light on patients' ways of life and support chronic conditions research.
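The two evaluation measures named in the abstract above, precision and F1-score, can be computed from true and predicted labels as in the following minimal sketch. The labels are hypothetical and unrelated to the study's data, and the cross-validation loop itself is not shown.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = patient)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical labels for six users in one validation fold.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Under k-fold cross-validation, these measures would typically be computed on each held-out fold and then averaged.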


2019 ◽  
Vol 1 (1) ◽  
pp. 257-263
Author(s):  
Alexandru-Răzvan Florea

Abstract Online Social Networks have become a significant part of our quotidian life. In this paper, we aim to provide a proof of concept of how social media data can be effectively extracted, processed and analyzed with powerful open source tools like R. Moreover, we aim to build a reliable methodology for testing and validating social trends by using social media data. We used API routines to establish the connection between R and Twitter, Deep Learning Models to estimate the demographics of the users, Logistic Regression Models to estimate the predispositions of the users, and Propensity Score Matching to build comparable data sets. After analyzing the Romanian Twitter users, the results of our inquiry show that most of them are relatively young and the percentage of males is significantly higher than the percentage of females. Moreover, our results confirm that facial appearances play an essential role in the popularity of an individual.
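Propensity Score Matching, as named in the abstract above, pairs each treated unit with a control unit that has a similar estimated propensity score. The abstract does not specify the matching algorithm, so the sketch below assumes greedy 1:1 nearest-neighbor matching without replacement and a hypothetical caliper of 0.1. The scores are hypothetical inputs (in practice they might come from a logistic regression) and the paper's R workflow is not reproduced here.

```python
def match_on_propensity(treated, control, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dicts mapping unit id -> propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)
    pairs = []
    # Process treated units in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        # Only accept matches whose score difference is within the caliper.
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
control = {"c1": 0.28, "c2": 0.60, "c3": 0.65, "c4": 0.10}
pairs = match_on_propensity(treated, control)
```

Treated units with no control inside the caliper (here "t3") are left unmatched, which trades sample size for more comparable groups.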

