Assessing wild fire risk in the United States using social media data

Crime monitoring tools are needed for public health and law enforcement officials to deploy appropriate resources and develop targeted interventions. Social media, such as Twitter, has been shown to be a feasible tool for monitoring and predicting public health events such as disease outbreaks. Social media might also serve as a feasible tool for crime surveillance. In this study, we collected Twitter data between May and December 2012 and crime data for the years 2012 and 2013 in the United States. We examined the association between crime data and drug-related tweets. We found that tweets from 2012 were strongly associated with county-level crime data in both 2012 and 2013. This study presents preliminary evidence that social media data can be used to help predict future crimes. We discuss how future research can build upon this initial study to further examine the feasibility and effectiveness of this approach.


Modeling Spatiotemporal Pattern of Depressive Symptoms Caused by COVID-19 Using Social Media Data Mining

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph17144988 ◽

2020 ◽

Vol 17 (14) ◽

pp. 4988 ◽

Cited By ~ 3

Author(s):

Diya Li ◽

Harshita Chaudhary ◽

Zhe Zhang

Keyword(s):

Data Mining ◽

Social Media ◽

San Francisco ◽

Learning Algorithm ◽

The United States ◽

Spatiotemporal Pattern ◽

Social Media Data ◽

Clinical Patient ◽

Stress Symptoms ◽

Media Data

By 29 May 2020, the coronavirus disease (COVID-19) caused by SARS-CoV-2 had spread to 188 countries, infecting more than 5.9 million people, and causing 361,249 deaths. Governments issued travel restrictions, gatherings of institutions were cancelled, and citizens were ordered to socially distance themselves in an effort to limit the spread of the virus. Fear of being infected by the virus and panic over job losses and missed education opportunities have increased people’s stress levels. Psychological studies using traditional surveys are time-consuming and contain cognitive and sampling biases, and therefore cannot be used to build large datasets for a real-time depression analysis. In this article, we propose a CorExQ9 algorithm that integrates a Correlation Explanation (CorEx) learning algorithm and clinical Patient Health Questionnaire (PHQ) lexicon to detect COVID-19 related stress symptoms at a spatiotemporal scale in the United States. The proposed algorithm overcomes the common limitations of traditional topic detection models and minimizes the ambiguity that is caused by human interventions in social media data mining. The results show a strong correlation between stress symptoms and the number of increased COVID-19 cases for major U.S. cities such as Chicago, San Francisco, Seattle, New York, and Miami. The results also show that people’s risk perception is sensitive to the release of COVID-19 related public news and media messages. Between January and March, fear of infection and unpredictability of the virus caused widespread panic and people began stockpiling supplies, but later in April, concerns shifted as financial worries in western and eastern coastal areas of the U.S. left people uncertain of the long-term effects of COVID-19 on their lives.


Next-generation visitation models using social media to estimate recreation on public lands

Scientific Reports ◽

10.1038/s41598-020-70829-x ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Spencer A. Wood ◽

Samantha G. Winder ◽

Emilia H. Lia ◽

Eric M. White ◽

Christian S. L. Crowley ◽

...

Keyword(s):

Social Media ◽

Public Lands ◽

The United States ◽

Multiple Sources ◽

Social Media Data ◽

Visitor Management ◽

Relative Value ◽

Promising Source ◽

Recreational Use ◽

Media Data

Outdoor and nature-based recreation provides countless social benefits, yet public land managers often lack information on the spatial and temporal extent of recreation activities. Social media is a promising source of data to fill information gaps because the amount of recreational use is positively correlated with social media activity. However, despite the implication that these correlations could be employed to accurately estimate visitation, there are no known transferable models parameterized for use with multiple social media data sources. This study tackles these issues by examining the relative value of multiple sources of social media in models that estimate visitation at unmonitored sites and times across multiple destinations. Using a novel dataset of over 30,000 social media posts and 286,000 observed visits from two regions in the United States, we compare multiple competing statistical models for estimating visitation. We find social media data substantially improve visitor estimates at unmonitored sites, even when a model is parameterized with data from another region. Visitation estimates are further improved when models are parameterized with on-site counts. These findings indicate that while social media do not fully substitute for on-site data, they are a powerful component of recreation research and visitor management.


How scientists can take the lead in establishing ethical practices for social media research

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocy174 ◽

2019 ◽

Vol 26 (4) ◽

pp. 311-313 ◽

Cited By ~ 7

Author(s):

Sherry Pagoto ◽

Camille Nebeker

Keyword(s):

Social Media ◽

Scientific Community ◽

Human Subjects ◽

Research Misconduct ◽

The United States ◽

Ethical Practices ◽

Social Media Data ◽

Media Research ◽

Social Media Research ◽

Media Data

Social media use has become ubiquitous in the United States, providing unprecedented opportunities for research. However, the rapidly evolving research landscape has far outpaced federal regulations for the protection of human subjects. Recent highly publicized scandals have raised legitimate concerns in the media about how social media data are being used. These circumstances, combined with the absence of ethical standards, put even the best-intentioned scientists at risk of possible research misconduct. The scientific community may need to lead the charge in ensuring the ethical use of social media data in scientific research. We propose 6 steps the scientific community can take to lead this charge. We underscore the important role of funding agencies and universities in creating the necessary ethics infrastructure to allow social media research to flourish in a way that is pro-technology, pro-science, and most importantly, pro-humanity.


Normalization Strategies for Enhancing Spatio-Temporal Analysis of Social Media Responses during Extreme Events: A Case Study based on Analysis of Four Extreme Events using Socio-Environmental Data Explorer (SEDE)

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-iv-4-w2-139-2017 ◽

2017 ◽

Vol IV-4/W2 ◽

pp. 139-146

Author(s):

J. Ajayakumar ◽

E. Shook ◽

V. K. Turner

Keyword(s):

Social Media ◽

Extreme Events ◽

The United States ◽

Environmental Data ◽

Data Sources ◽

Social Media Data ◽

Public Response ◽

Spatio Temporal ◽

Media Data ◽

Media Responses

With social media becoming increasingly location-based, there has been a greater push from researchers across various domains, including social science, public health, and disaster management, to tap into the spatial, temporal, and textual data available from these sources to analyze public response during extreme events such as an epidemic outbreak or a natural disaster. Studies based on demographics and other socio-economic factors suggest that social media data could be highly skewed based on the variations of population density with respect to place. To capture the spatio-temporal variations in public response during extreme events, we have developed the Socio-Environmental Data Explorer (SEDE). SEDE collects and integrates social media, news, and environmental data to support exploration and assessment of public response to extreme events. For this study, using SEDE, we conduct spatio-temporal social media response analysis on four major extreme events in the United States: the “North American storm complex” in December 2015, “snowstorm Jonas” in January 2016, the “West Virginia floods” in June 2016, and “Hurricane Matthew” in October 2016. Analysis is conducted on geo-tagged social media data from Twitter and warnings from the storm events database provided by the National Centers for Environmental Information (NCEI). Results demonstrate that, to support complex social media analyses, spatial and population-based normalization and filtering are necessary. These results suggest that, when developing software solutions to support analysis of non-conventional data sources such as social media, it is essential to identify the inherent biases associated with the data sources and to adapt techniques and enhance capabilities to mitigate those biases. The normalization strategies that we have developed and incorporated into SEDE will help reduce the population bias associated with social media data and will be useful for researchers and decision makers seeking to enhance their analysis of spatio-temporal social media responses during extreme events.
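The abstract above does not spell out SEDE's normalization formulas, so the following is only a minimal sketch of one common population-based scheme: converting raw post counts into posts per 10,000 residents. The function name `per_capita_rate`, the region labels, and all numbers are hypothetical and invented for illustration; they are not taken from the paper.

```python
def per_capita_rate(counts, population, per=10_000):
    """Convert raw post counts into posts per `per` residents for each region.

    Regions with missing or non-positive population are skipped, since a
    rate cannot be computed for them.
    """
    rates = {}
    for region, n_posts in counts.items():
        pop = population.get(region, 0)
        if pop <= 0:
            continue  # no usable population figure for this region
        rates[region] = n_posts * per / pop
    return rates


# Hypothetical counts: the urban region posts far more in absolute terms.
counts = {"urban": 5000, "rural": 60}
population = {"urban": 1_000_000, "rural": 5_000}

rates = per_capita_rate(counts, population)
for region, rate in sorted(rates.items()):
    print(f"{region}: {rate:.1f} posts per 10,000 residents")
```

In this made-up example the rural region has fewer posts in absolute terms but a higher per-capita rate, which is the kind of population bias that normalization is meant to expose.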


Investigation of Geographic and Macrolevel Variations in LGBTQ Patient Experiences: Longitudinal Social Media Analysis

Journal of Medical Internet Research ◽

10.2196/17087 ◽

2020 ◽

Vol 22 (7) ◽

pp. e17087

Author(s):

Yulin Hswen ◽

Amanda Zhang ◽

Kara C Sewalk ◽

Gaurav Tuli ◽

John S Brownstein ◽

...

Keyword(s):

Social Media ◽

Patient Experience ◽

State Level ◽

The United States ◽

Media Analysis ◽

Political Climate ◽

Social Media Data ◽

Political Identification ◽

Twitter Users ◽

Media Data

Background: Discrimination in the health care system contributes to worse health outcomes among lesbian, gay, bisexual, transgender, and queer (LGBTQ) patients. Objective: The aim of this study is to examine disparities in patient experience among LGBTQ persons using social media data. Methods: We collected patient experience data from Twitter from February 2013 to February 2017 in the United States. We compared the sentiment of patient experience tweets between Twitter users who self-identified as LGBTQ and non-LGBTQ. The effect of state-level partisan identity on patient experience sentiment and differences between LGBTQ users and non-LGBTQ users were analyzed. Results: We observed lower (more negative) patient experience sentiment among 13,689 LGBTQ users compared to 1,362,395 non-LGBTQ users. Increasing state-level liberal political identification was associated with higher patient experience sentiment among all users but had stronger effects for LGBTQ users. Conclusions: Our findings highlight that social media data can yield insights about patient experience for LGBTQ persons and suggest that a state-level sociopolitical environment influences patient experience for this group. Efforts are needed to reduce disparities in patient care for LGBTQ persons while taking into context the effect of the political climate on these inequities.


Policy Change and Public Opinion: Measuring Shifting Political Sentiment With Social Media Data

American Politics Research ◽

10.1177/1532673x20920263 ◽

2020 ◽

Vol 48 (5) ◽

pp. 612-621

Author(s):

Nicholas Joseph Adams-Cohen

Keyword(s):

Social Media ◽

Public Opinion ◽

Policy Change ◽

Gay Rights ◽

The United States ◽

Same Sex ◽

Data Set ◽

Social Media Data ◽

Same Sex Marriage ◽

Media Data

This article uses Twitter data and machine-learning methods to analyze the causal impact of the Supreme Court’s legalization of same-sex marriage at the federal level in the United States on political sentiment and discourse toward gay rights. In relying on social media text data, this project constructs a large data set of expressed political opinions in the short time frame before and after the Obergefell v. Hodges decision. Due to the variation in state laws regarding the legality of same-sex marriage prior to the Supreme Court’s decision, I use a difference-in-difference estimator to show that, in those states where the Court’s ruling produced a policy change, there was relatively more negative movement in public opinion toward same-sex marriage and gay rights issues as compared with other states. This confirms previous studies that show Supreme Court decisions polarize public opinion in the short term, extends previous results by demonstrating opinion becomes relatively more negative in states where policy is overturned, and demonstrates how to use social media data to engage in causal analyses.
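As a rough illustration of the difference-in-differences idea mentioned in the abstract above, the sketch below computes the basic two-group, two-period estimate: the change in the treated group minus the change in the control group. It is not the author's model, which is not described in enough detail here to reproduce. The function name `diff_in_diff` and the sentiment scores are hypothetical and invented purely for illustration.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 difference-in-differences estimate.

    Returns (change in treated group mean) - (change in control group mean).
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical average sentiment scores (not real data).
# "Treated" stands for states where the ruling changed policy,
# "control" for states where same-sex marriage was already legal.
treated_pre = [0.10, 0.20, 0.15]
treated_post = [0.00, 0.05, 0.10]
control_pre = [0.30, 0.25, 0.35]
control_post = [0.30, 0.30, 0.30]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"Estimated effect on sentiment: {effect:+.2f}")
```

A negative estimate in this toy setup would correspond to sentiment moving relatively more negatively in the treated states. The estimate is only interpretable causally under the usual parallel-trends assumption, namely that both groups would have moved in the same way without the policy change.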


Online negative sentiment towards Mexicans and Hispanics and impact on mental well-being: A time-series analysis of social media data during the 2016 United States presidential election

Heliyon ◽

10.1016/j.heliyon.2020.e04910 ◽

2020 ◽

Vol 6 (9) ◽

pp. e04910

Author(s):

Yulin Hswen ◽

Qiuyuan Qin ◽

David R. Williams ◽

K. Viswanath ◽

S.V. Subramanian ◽

...

Keyword(s):

United States ◽

Social Media ◽

Time Series ◽

Time Series Analysis ◽

Presidential Election ◽

Well Being ◽

Social Media Data ◽

Negative Sentiment ◽

Mental Well Being ◽

Media Data


Using Social Media to Investigate Geographic and Macro-Level Variations in LGBTQ Patient Experiences (Preprint)

10.2196/preprints.17087 ◽

2019 ◽

Author(s):

Yulin Hswen ◽

Amanda Zhang ◽

Kara Sewalk ◽

Gaurav Tuli ◽

John S Brownstein ◽

...

Keyword(s):

Social Media ◽

Patient Experience ◽

State Level ◽

The United States ◽

Political Environment ◽

Political Climate ◽

Social Media Data ◽

Political Identification ◽

Twitter Users ◽

Media Data

BACKGROUND: Discrimination in the healthcare system contributes to worse health outcomes among lesbian, gay, bisexual, transgender and queer (LGBTQ) patients. OBJECTIVE: To examine disparities in patient experience among LGBTQ persons using social media data. METHODS: We collected patient experience data from Twitter from February 2013 to February 2017 in the United States. We compared sentiment of patient experience tweets between Twitter users who self-identified as LGBTQ and non-LGBTQ. The effect of state-level partisan identity on patient experience sentiment and differences between LGBTQ users and non-LGBTQ users were analyzed. RESULTS: We observed lower patient experience sentiment among 13,689 LGBTQ users compared to 1,362,395 non-LGBTQ users. Increasing state-level liberal political identification was associated with higher patient experience sentiment among all users but had stronger effects for LGBTQ users. CONCLUSIONS: Our findings highlight that social media data can yield insights about patient experience for LGBTQ persons and suggest that state-level socio-political environment influences patient experience for this group. Efforts are needed to reduce disparities in patient care for LGBTQ persons while taking into context the effect of political climate on these inequities.


Next-generation Visitation Models using Social Media to Estimate Recreation on Public Lands

10.31235/osf.io/4wm97 ◽

2020 ◽

Author(s):

Spencer A Wood ◽

Samantha Winder ◽

Emilia Lia ◽

Eric White ◽

Christian Crowley ◽

...

Keyword(s):

Social Media ◽

Public Lands ◽

The United States ◽

Multiple Sources ◽

Social Media Data ◽

Visitor Management ◽

Relative Value ◽

Promising Source ◽

Recreational Use ◽

Media Data

Outdoor and nature-based recreation provides countless social benefits, yet public land managers often lack information on the spatial and temporal extent of recreation activities. Social media is a promising source of data to fill information gaps because the amount of recreational use is positively correlated with social media activity. However, despite the implication that these correlations could be employed to accurately estimate visitation, there are no known transferable models parameterized for use with multiple social media data sources. This study tackles these issues by examining the relative value of multiple sources of social media in models that estimate visitation at unmonitored sites and times across multiple destinations. Using a novel dataset of over 30,000 social media posts and 286,000 observed visits from two regions in the United States, we compare multiple competing statistical models for estimating visitation. We find social media data substantially improve visitor estimates at unmonitored sites, even when a model is parameterized with data from another region. Visitation estimates are further improved when models are parameterized with on-site counts. These findings indicate that while social media do not fully substitute for on-site data, they are a powerful component of recreation research and visitor management.
