Social Media Discussions Predict Mental Health Consultations on College Campuses

Abstract The mental health of college students is a growing concern, and their mental health needs are difficult to assess in real time and at scale. To address this gap, researchers and practitioners have encouraged the use of passive technologies. Social media is one such technology that has shown potential as a viable "passive sensor" of mental health. However, the construct validity and in-practice reliability of computational assessments of mental health constructs with social media data remain largely unexplored. Towards this goal, we study how assessing the mental health of college students using social media data corresponds with ground-truth data of on-campus mental health consultations. For a large U.S. public university, we obtained ground-truth data of on-campus mental health consultations between 2011–2016, and collected 66,000 posts from the university’s Reddit community. We adopted machine learning and natural language methodologies to measure symptomatic mental health expressions of depression, anxiety, stress, suicidal ideation, and psychosis on the social media data. Seasonal auto-regressive integrated moving average (SARIMA) models for forecasting on-campus mental health consultations showed that incorporating social media data led to predictions with r = 0.86 and SMAPE = 13.30, outperforming models without social media data by 41%. Our language analyses revealed that social media discussions during months with high numbers of mental health consultations centered on academics and career, whereas months with low numbers of consultations saliently showed expressions of positive affect, collective identity, and socialization. This study reveals that social media data can improve our understanding of college students’ mental health, particularly their mental health treatment needs.
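The abstract above reports forecast quality as a Pearson correlation (r) and a symmetric mean absolute percentage error (SMAPE). As a rough illustration of how such scores can be computed, here is a minimal sketch in plain Python. The abstract does not say which SMAPE variant the authors used, so the formula below (mean of |F - A| / ((|A| + |F|) / 2), reported in percent) is an assumption, and the monthly counts in the usage lines are synthetic placeholders, not data from the study.

```python
import math


def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent.

    Assumed variant: mean of |F - A| / ((|A| + |F|) / 2) * 100.
    The paper may use a different normalization.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        # A pair where both values are zero contributes no error.
        total += 0.0 if denom == 0 else abs(f - a) / denom
    return 100.0 * total / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Synthetic monthly consultation counts (hypothetical, for illustration only).
observed = [120, 150, 180, 90, 60, 140]
predicted = [110, 160, 170, 100, 70, 130]
print(f"SMAPE = {smape(observed, predicted):.2f}, r = {pearson_r(observed, predicted):.2f}")
```

Lower SMAPE and higher r both indicate forecasts that track the observed series more closely, which is how the two models (with and without social media features) would be compared.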


Mental Health Consultations on College Campuses: Examining the Predictive Ability of Social Media

10.21203/rs.3.rs-196605/v1 ◽

2021 ◽

Author(s):

Koustuv Saha ◽

Asra Yousuf ◽

Ryan L. Boyd ◽

James W. Pennebaker ◽

Munmun De Choudhury

Keyword(s):

Mental Health ◽

College Students ◽

Social Media ◽

Collective Identity ◽

Ground Truth ◽

Treatment Needs ◽

Ground Truth Data ◽

Social Media Data ◽

Mental Health Consultations ◽

Media Data

Abstract The mental health of college students is a growing concern, and their mental health needs are difficult to assess in real time and at scale. While social media has shown potential as a viable “passive sensor” of mental health, the construct validity and in-practice reliability of such computational assessments remain largely unexplored. Towards this goal, we study how assessing the mental health of college students using social media data corresponds with ground-truth data of on-campus mental health consultations. For a large U.S. public university, we obtained ground-truth data of on-campus mental health consultations between 2011–2016, and collected 66,000 posts from the university’s Reddit community. We adopted machine learning and natural language methodologies to measure symptomatic mental health expressions of depression, anxiety, stress, suicidal ideation, and psychosis on the social media data. Seasonal auto-regressive integrated moving average (SARIMA) models for forecasting on-campus mental health consultations showed that incorporating social media data led to predictions with r=0.86 and SMAPE=13.30, outperforming models without social media data by 41%. Our language analyses revealed that social media discussions during months with high numbers of mental health consultations centered on academics and career, whereas months with low numbers of consultations saliently showed expressions of positive affect, collective identity, and socialization. This study reveals that social media data can improve our understanding of college students’ mental health, particularly their mental health treatment needs.


Mental Health Consultations on College Campuses: Examining the Predictive Ability of Social Media

10.21203/rs.3.rs-162266/v1 ◽

2021 ◽

Author(s):

Koustuv Saha ◽

Asra Yousuf ◽

Ryan Boyd ◽

James Pennebaker ◽

Munmun De Choudhury

Keyword(s):

Mental Health ◽

College Students ◽

Social Media ◽

Collective Identity ◽

Ground Truth ◽

Treatment Needs ◽

Ground Truth Data ◽

Social Media Data ◽

Mental Health Consultations ◽

Media Data

Abstract The mental health of college students is a growing concern, and their mental health needs are difficult to assess in real time and at scale. While social media has shown potential as a viable "passive sensor" of mental health, the construct validity and in-practice reliability of such computational assessments remain largely unexplored. Towards this goal, we study how assessing the mental health of college students using social media data corresponds with ground-truth data of on-campus mental health consultations. For a large U.S. public university, we obtained ground-truth data of on-campus mental health consultations between 2011-2016, and collected 66,000 posts from the university's Reddit community. We adopted machine learning and natural language methodologies to measure symptomatic mental health expressions of depression, anxiety, stress, suicidal ideation, and psychosis on the social media data. Seasonal auto-regressive integrated moving average (SARIMA) models for forecasting on-campus mental health consultations showed that incorporating social media data led to predictions with r=0.86 and SMAPE=13.30, outperforming models without social media data by 41%. Our language analyses revealed that social media discussions during months with high numbers of mental health consultations centered on academics and career, whereas months with low numbers of consultations saliently showed expressions of positive affect, collective identity, and socialization. This study reveals that social media data can improve our understanding of college students' mental health, particularly their mental health treatment needs.


Comparison of Social Media, Syndromic Surveillance, and Microbiologic Acute Respiratory Infection Data: Observational Study

JMIR Public Health and Surveillance ◽

10.2196/14986 ◽

2020 ◽

Vol 6 (2) ◽

pp. e14986 ◽

Cited By ~ 2

Author(s):

Ashlynn R Daughton ◽

Rumi Chunara ◽

Michael J Paul

Keyword(s):

Infectious Disease ◽

Social Media ◽

Random Sample ◽

Topic Model ◽

Ground Truth ◽

Ground Truth Data ◽

Social Media Data ◽

Individual Level ◽

Small Effect Size ◽

Media Data

Background Internet data can be used to improve infectious disease models. However, the representativeness and individual-level validity of internet-derived measures are largely unexplored as this requires ground truth data for study. Objective This study sought to identify relationships between Web-based behaviors and/or conversation topics and health status using a ground truth, survey-based dataset. Methods This study leveraged a unique dataset of self-reported surveys, microbiological laboratory tests, and social media data from the same individuals toward understanding the validity of individual-level constructs pertaining to influenza-like illness in social media data. Logistic regression models were used to identify illness in Twitter posts using user posting behaviors and topic model features extracted from users’ tweets. Results Of 396 original study participants, only 81 met the inclusion criteria for this study. Of these participants’ tweets, we identified only two instances that were related to health and occurred within 2 weeks (before or after) of a survey indicating symptoms. It was not possible to predict when participants reported symptoms using features derived from topic models (area under the curve [AUC]=0.51; P=.38), though it was possible using behavior features, albeit with a very small effect size (AUC=0.53; P≤.001). Individual symptoms were generally not predictable either. The study sample and a random sample from Twitter are predictably different on held-out data (AUC=0.67; P≤.001), meaning that the content posted by people who participated in this study was predictably different from that posted by random Twitter users. Individuals in the random sample and the GoViral sample used Twitter with similar frequencies (similar @ mentions, number of tweets, and number of retweets; AUC=0.50; P=.19).
Conclusions To our knowledge, this is the first instance of an attempt to use a ground truth dataset to validate infectious disease observations in social media data. The lack of signal, the lack of predictability among behaviors or topics, and the demonstrated volunteer bias in the study population are important findings for the large and growing body of disease surveillance using internet-sourced data.
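The results above are reported as area under the ROC curve (AUC), where 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The following minimal sketch shows one standard way to compute AUC, as the fraction of positive/negative pairs that the classifier ranks correctly (ties count as half). It is an illustration only and not the authors' evaluation code; the labels and scores are synthetic.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth values (1 = symptomatic report).
    scores: iterable of classifier scores, higher meaning "more likely 1".
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0   # positive ranked above negative
            elif p == n:
                correct += 0.5   # tie counts as half
    return correct / (len(positives) * len(negatives))


# Synthetic example: four users, two of whom reported symptoms.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(labels, scores))  # 3 of 4 pairs are ranked correctly
```

Read this way, an AUC of 0.51 or 0.53 means the model ranks a symptomatic report above a non-symptomatic one only marginally more often than a coin flip would.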


Assessing the mental health of college students by leveraging social media data

XRDS Crossroads The ACM Magazine for Students ◽

10.1145/3481834 ◽

2021 ◽

Vol 28 (1) ◽

pp. 54-58

Author(s):

Koustuv Saha ◽

Munmun De Choudhury

Keyword(s):

Mental Health ◽

College Students ◽

Social Media ◽

Young Adults ◽

Real Time ◽

Health Needs ◽

Mental Health Needs ◽

Social Media Data ◽

Use Of Social Media ◽

Media Data

The mental health of college students is a growing concern, and the mental health needs of this group are difficult to assess in real time and at scale. The ubiquity and widespread use of social media, particularly among young adults, provides opportunities for various stakeholders to proactively assess the mental health of college students and provide timely and tailored support.


Mining Social Media Data for Biomedical Signals and Health-Related Behavior

Annual Review of Biomedical Data Science ◽

10.1146/annurev-biodatasci-030320-040844 ◽

2020 ◽

Vol 3 (1) ◽

pp. 433-458 ◽

Cited By ~ 1

Author(s):

Rion Brattig Correia ◽

Ian B. Wood ◽

Johan Bollen ◽

Luis M. Rocha

Keyword(s):

Mental Health ◽

Social Media ◽

Population Level ◽

Data Access ◽

Health Conditions ◽

Social Phenomena ◽

Medical Treatments ◽

Social Media Data ◽

Health Related ◽

Media Data

Social media data have been increasingly used to study biomedical and health-related phenomena. From cohort-level discussions of a condition to population-level analyses of sentiment, social media have provided scientists with unprecedented amounts of data to study human behavior associated with a variety of health conditions and medical treatments. Here we review recent work in mining social media for biomedical, epidemiological, and social phenomena information relevant to the multilevel complexity of human health. We pay particular attention to topics where social media data analysis has shown the most progress, including pharmacovigilance and sentiment analysis, especially for mental health. We also discuss a variety of innovative uses of social media data for health-related applications as well as important limitations of social media data access and use.


A Pipeline to Understand Emerging Illness Via Social Media Data Analysis: Case Study on Breast Implant Illness (Preprint)

10.2196/preprints.29768 ◽

2021 ◽

Author(s):

Vishal Dey ◽

Peter Krasniak ◽

Minh Nguyen ◽

Clara Lee ◽

Xia Ning

Keyword(s):

Mental Health ◽

Social Media ◽

Natural Language Processing ◽

Data Analysis ◽

Natural Language ◽

Language Processing ◽

Breast Implant ◽

Public Attention ◽

Social Media Data ◽

Media Data

BACKGROUND A new illness can come to public attention through social media before it is medically defined, formally documented, or systematically studied. One example is a condition known as breast implant illness (BII), which has been extensively discussed on social media, although it is vaguely defined in the medical literature. OBJECTIVE The objective of this study is to construct a data analysis pipeline to understand emerging illnesses using social media data and to apply the pipeline to understand the key attributes of BII. METHODS We constructed a pipeline of social media data analysis using natural language processing and topic modeling. Mentions related to signs, symptoms, diseases, disorders, and medical procedures were extracted from social media data using the clinical Text Analysis and Knowledge Extraction System. We mapped the mentions to standard medical concepts and then summarized these mapped concepts as topics using latent Dirichlet allocation. Finally, we applied this pipeline to understand BII from several BII-dedicated social media sites. RESULTS Our pipeline identified topics related to toxicity, cancer, and mental health issues that were highly associated with BII. Our pipeline also showed that cancers, autoimmune disorders, and mental health problems were emerging concerns associated with breast implants, based on social media discussions. Furthermore, the pipeline identified mentions such as rupture, infection, pain, and fatigue as common self-reported issues among the public, as well as concerns about toxicity from silicone implants. CONCLUSIONS Our study could inspire future studies on the suggested symptoms and factors of BII. Our study provides the first analysis and derived knowledge of BII from social media using natural language processing techniques and demonstrates the potential of using social media information to better understand similar emerging illnesses.


Social Media Reveals Psychosocial Effects of the COVID-19 Pandemic

10.1101/2020.08.07.20170548 ◽

2020 ◽

Author(s):

Koustuv Saha ◽

John Torous ◽

Eric D. Caine ◽

Munmun De Choudhury

Keyword(s):

Mental Health ◽

Social Media ◽

Psychosocial Effects ◽

Self Disclosure ◽

Social Media Data ◽

Health Concerns ◽

Mental Health Concerns ◽

Precautionary Measures ◽

Media Data ◽

Over Time

Abstract Background The novel coronavirus disease 2019 (COVID-19) pandemic has caused several disruptions in personal and collective lives worldwide. The uncertainties surrounding the pandemic have also led to multi-faceted mental health concerns, which can be exacerbated with precautionary measures such as social distancing and self-quarantining, as well as societal impacts such as economic downturn and job loss. Despite noting this as a “mental health tsunami,” the psychological effects of the COVID-19 crisis remain unexplored at scale. Consequently, public health stakeholders are currently limited in identifying ways to provide timely and tailored support during these circumstances. Objective Our work aims to provide insights regarding people’s psychosocial concerns during the COVID-19 pandemic by leveraging social media data. We aim to study the temporal and linguistic changes in symptomatic mental health and support expressions in the pandemic context. Methods We obtain ∼60M Twitter streaming posts originating from the U.S. from 24 March-24 May 2020, and compare these with ∼40M posts from a comparable period in 2019 to attribute the effect of COVID-19 on people’s social media self-disclosure. Using these datasets, we study people’s self-disclosure on social media in terms of symptomatic mental health concerns and expressions of support. We employ transfer learning classifiers that identify the social media language indicative of mental health outcomes (anxiety, depression, stress, and suicidal ideation) and support (emotional and informational support). We then examine the changes in psychosocial expressions over time and language, comparing the 2020 and 2019 datasets. Results We find that all of the examined psychosocial expressions have significantly increased during the COVID-19 crisis – mental health symptomatic expressions have increased by ∼14%, and support expressions have increased by ∼5%, both thematically related to COVID-19. We also observe a steady decline and eventual plateauing in these expressions during the COVID-19 pandemic, which may have been due to habituation or due to supportive policy measures enacted during this period. Our language analyses highlight that people express concerns that are very specific to and contextually related to the COVID-19 crisis. Conclusions We studied the psychosocial effects of the COVID-19 crisis by using social media data from 2020, finding that people’s mental health symptomatic and support expressions significantly increased during the COVID-19 period as compared to similar data from 2019. However, this effect gradually lessened over time, suggesting that people adapted to the circumstances and their “new normal”. Our linguistic analyses revealed that people expressed mental health concerns regarding personal and professional challenges, healthcare and precautionary measures, and pandemic-related awareness. This work shows the potential to provide insights to mental healthcare and stakeholders and policymakers in planning and implementing measures to mitigate mental health risks amidst the health crisis.


Studies of depression and anxiety using Reddit as a data source: Scoping review (Preprint)

10.2196/preprints.29487 ◽

2021 ◽

Author(s):

Nick Boettcher

Keyword(s):

Mental Health ◽

Social Media ◽

Scoping Review ◽

Primary Data ◽

Technical Solution ◽

Depression And Anxiety ◽

Social Media Data ◽

Primary Data Source ◽

Data Source ◽

Media Data

BACKGROUND The study of depression and anxiety using publicly available social media data is a research activity that has grown considerably over the last decade. The discussion platform Reddit has become a popular social media data source in this nascent area of study, in part because of the unique ways in which the platform is facilitative of research. To date, no work has been done to synthesize existing studies of depression and anxiety using Reddit. OBJECTIVE The objective of this review is to understand the scope and nature of research using Reddit as a primary data source for studying depression and anxiety. METHODS A scoping review was conducted using the Arksey and O’Malley framework. Academic databases searched include MEDLINE/PubMed, EMBASE, CINAHL, PsycINFO, PsycARTICLES, Scopus, ScienceDirect, IEEE Xplore, and ACM database. Inclusion criteria were developed using the Participants/Concept/Context framework outlined by the Joanna Briggs Institute Scoping Review Methodology Group. Eligible studies featured a methodological focus on analyzing depression and/or anxiety using naturalistic written expressions from Reddit users as the primary data source. RESULTS A total of 54 studies were included for review. Tables and corresponding analysis delineate key methodological features including a comparatively larger focus on depression versus anxiety, an even split of original and premade datasets, a favored analytic focus on classifying the mental health states of Reddit users, and practical implications often recommending new methods of professionally-driven mental health monitoring and outreach for Reddit users. CONCLUSIONS Studies of depression and anxiety using Reddit data are currently driven by a prevailing methodology which favors a technical, solution-based orientation. Researchers interested in advancing this research area will benefit from further consideration of conceptual issues surrounding interpretation of Reddit data with the medical model of mental health.
Further efforts are also needed to locate accountability and autonomy within practice implications suggesting new forms of engagement with Reddit users.


On the State of Social Media Data for Mental Health Research

10.18653/v1/2021.clpsych-1.2 ◽

2021 ◽

Author(s):

Keith Harrigian ◽

Carlos Aguirre ◽

Mark Dredze

Keyword(s):

Mental Health ◽

Social Media ◽

Health Research ◽

Mental Health Research ◽

The State ◽

Social Media Data ◽

Media Data


Psychosocial Effects of the COVID-19 Pandemic: Large-scale Quasi-Experimental Study on Social Media (Preprint)

10.2196/preprints.22600 ◽

2020 ◽

Author(s):

Koustuv Saha ◽

John Torous ◽

Eric D Caine ◽

Munmun De Choudhury

Keyword(s):

Mental Health ◽

Social Media ◽

Data Sets ◽

Psychosocial Effects ◽

Self Disclosure ◽

Social Media Data ◽

Health Concerns ◽

Mental Health Concerns ◽

Precautionary Measures ◽

Media Data

BACKGROUND The COVID-19 pandemic has caused several disruptions in personal and collective lives worldwide. The uncertainties surrounding the pandemic have also led to multifaceted mental health concerns, which can be exacerbated with precautionary measures such as social distancing and self-quarantining, as well as societal impacts such as economic downturn and job loss. Despite noting this as a “mental health tsunami”, the psychological effects of the COVID-19 crisis remain unexplored at scale. Consequently, public health stakeholders are currently limited in identifying ways to provide timely and tailored support during these circumstances. OBJECTIVE Our study aims to provide insights regarding people’s psychosocial concerns during the COVID-19 pandemic by leveraging social media data. We aim to study the temporal and linguistic changes in symptomatic mental health and support expressions in the pandemic context. METHODS We obtained about 60 million Twitter streaming posts originating from the United States from March 24 to May 24, 2020, and compared these with about 40 million posts from a comparable period in 2019 to attribute the effect of COVID-19 on people’s social media self-disclosure. Using these data sets, we studied people’s self-disclosure on social media in terms of symptomatic mental health concerns and expressions of support. We employed transfer learning classifiers that identified the social media language indicative of mental health outcomes (anxiety, depression, stress, and suicidal ideation) and support (emotional and informational support). We then examined the changes in psychosocial expressions over time and language, comparing the 2020 and 2019 data sets. RESULTS We found that all of the examined psychosocial expressions have significantly increased during the COVID-19 crisis—mental health symptomatic expressions have increased by about 14%, and support expressions have increased by about 5%, both thematically related to COVID-19. 
We also observed a steady decline and eventual plateauing in these expressions during the COVID-19 pandemic, which may have been due to habituation or due to supportive policy measures enacted during this period. Our language analyses highlighted that people express concerns that are specific to and contextually related to the COVID-19 crisis. CONCLUSIONS We studied the psychosocial effects of the COVID-19 crisis by using social media data from 2020, finding that people’s mental health symptomatic and support expressions significantly increased during the COVID-19 period as compared to similar data from 2019. However, this effect gradually lessened over time, suggesting that people adapted to the circumstances and their “new normal.” Our linguistic analyses revealed that people expressed mental health concerns regarding personal and professional challenges, health care and precautionary measures, and pandemic-related awareness. This study shows the potential to provide insights to mental health care and stakeholders and policy makers in planning and implementing measures to mitigate mental health risks amid the health crisis.
