Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis (Preprint)

BACKGROUND Depression is a serious personal and public mental health problem. Self-reporting is the main method used to diagnose depression and to determine the severity of depression. However, it is not easy to discover patients with depression owing to feelings of shame in disclosing or discussing their mental health conditions with others. Moreover, self-reporting is time-consuming, and usually leads to missing a certain number of cases. Therefore, automatic discovery of patients with depression from other sources such as social media has been attracting increasing attention. Social media, as one of the most important daily communication systems, connects large quantities of people, including individuals with depression, and provides a channel to discover patients with depression. In this study, we investigated deep-learning methods for depression risk prediction using data from Chinese microblogs, which have potential to discover more patients with depression and to trace their mental health conditions. OBJECTIVE The aim of this study was to explore the potential of state-of-the-art deep-learning methods on depression risk prediction from Chinese microblogs. METHODS Deep-learning methods with pretrained language representation models, including bidirectional encoder representations from transformers (BERT), robustly optimized BERT pretraining approach (RoBERTa), and generalized autoregressive pretraining for language understanding (XLNET), were investigated for depression risk prediction, and were compared with previous methods on a manually annotated benchmark dataset. Depression risk was assessed at four levels from 0 to 3, where 0, 1, 2, and 3 denote no inclination, and mild, moderate, and severe depression risk, respectively. The dataset was collected from the Chinese microblog Weibo. We also compared different deep-learning methods with pretrained language representation models in two settings: (1) publicly released pretrained language representation models, and (2) language representation models further pretrained on a large-scale unlabeled dataset collected from Weibo. Precision, recall, and F1 scores were used as performance evaluation measures. RESULTS Among the three deep-learning methods, BERT achieved the best performance with a microaveraged F1 score of 0.856. RoBERTa achieved the best performance with a macroaveraged F1 score of 0.424 on depression risk at levels 1, 2, and 3, which represents a new benchmark result on the dataset. The further pretrained language representation models demonstrated improvement over publicly released prediction models. CONCLUSIONS We applied deep-learning methods with pretrained language representation models to automatically predict depression risk using data from Chinese microblogs. The experimental results showed that the deep-learning methods performed better than previous methods, and have greater potential to discover patients with depression and to trace their mental health conditions.

Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis

JMIR Medical Informatics ◽

10.2196/17958 ◽

2020 ◽

Vol 8 (7) ◽

pp. e17958

Author(s):

Xiaofeng Wang ◽

Shuai Chen ◽

Tao Li ◽

Wanting Li ◽

Yejie Zhou ◽

...

Keyword(s):

Mental Health ◽

Social Media ◽

Deep Learning ◽

Risk Prediction ◽

Health Conditions ◽

Learning Methods ◽

Mental Health Conditions ◽

Depression Risk ◽

Language Representation ◽

Using Data

Background Depression is a serious personal and public mental health problem. Self-reporting is the main method used to diagnose depression and to determine the severity of depression. However, it is not easy to discover patients with depression owing to feelings of shame in disclosing or discussing their mental health conditions with others. Moreover, self-reporting is time-consuming, and usually leads to missing a certain number of cases. Therefore, automatic discovery of patients with depression from other sources such as social media has been attracting increasing attention. Social media, as one of the most important daily communication systems, connects large quantities of people, including individuals with depression, and provides a channel to discover patients with depression. In this study, we investigated deep-learning methods for depression risk prediction using data from Chinese microblogs, which have potential to discover more patients with depression and to trace their mental health conditions. Objective The aim of this study was to explore the potential of state-of-the-art deep-learning methods on depression risk prediction from Chinese microblogs. Methods Deep-learning methods with pretrained language representation models, including bidirectional encoder representations from transformers (BERT), robustly optimized BERT pretraining approach (RoBERTa), and generalized autoregressive pretraining for language understanding (XLNET), were investigated for depression risk prediction, and were compared with previous methods on a manually annotated benchmark dataset. Depression risk was assessed at four levels from 0 to 3, where 0, 1, 2, and 3 denote no inclination, and mild, moderate, and severe depression risk, respectively. The dataset was collected from the Chinese microblog Weibo. We also compared different deep-learning methods with pretrained language representation models in two settings: (1) publicly released pretrained language representation models, and (2) language representation models further pretrained on a large-scale unlabeled dataset collected from Weibo. Precision, recall, and F1 scores were used as performance evaluation measures. Results Among the three deep-learning methods, BERT achieved the best performance with a microaveraged F1 score of 0.856. RoBERTa achieved the best performance with a macroaveraged F1 score of 0.424 on depression risk at levels 1, 2, and 3, which represents a new benchmark result on the dataset. The further pretrained language representation models demonstrated improvement over publicly released prediction models. Conclusions We applied deep-learning methods with pretrained language representation models to automatically predict depression risk using data from Chinese microblogs. The experimental results showed that the deep-learning methods performed better than previous methods, and have greater potential to discover patients with depression and to trace their mental health conditions.
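The abstract reports both a microaveraged and a macroaveraged F1 score, the latter restricted to risk levels 1, 2, and 3. The two averages can differ sharply when one class (here presumably level 0) dominates. The following is a minimal sketch of how the two could be computed from gold and predicted labels; the function name `f1_scores` and the toy labels are illustrative assumptions, and the abstract does not state exactly how the authors computed their scores.

```python
from collections import Counter

def f1_scores(y_true, y_pred, labels):
    """Return (micro_f1, macro_f1) computed over the given subset of labels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            if t in labels:
                tp[t] += 1          # correct prediction for a tracked label
        else:
            if p in labels:
                fp[p] += 1          # predicted a tracked label wrongly
            if t in labels:
                fn[t] += 1          # missed a tracked gold label

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts across labels, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro

# Toy data using the four risk levels 0-3 (made up for illustration).
gold = [0, 0, 0, 0, 1, 1, 2, 3]
pred = [0, 0, 0, 1, 1, 0, 2, 2]
micro_all, macro_all = f1_scores(gold, pred, labels=[0, 1, 2, 3])
micro_risk, macro_risk = f1_scores(gold, pred, labels=[1, 2, 3])
```

Because the macro average weights every level equally, a model that rarely predicts the rare severe level correctly is penalized much more than under the micro average, which is consistent with the large gap between the two reported figures.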

Erratum: Corrigendum: Characterisation of mental health conditions in social media using Informed Deep Learning

Scientific Reports ◽

10.1038/srep46813 ◽

2017 ◽

Vol 7 (1) ◽

Cited By ~ 3

Author(s):

George Gkotsis ◽

Anika Oellrich ◽

Sumithra Velupillai ◽

Maria Liakata ◽

Tim J. P. Hubbard ◽

...

Keyword(s):

Mental Health ◽

Social Media ◽

Deep Learning ◽

Health Conditions ◽

Mental Health Conditions

Faculty of 1000 evaluation for Corrigendum: Characterisation of mental health conditions in social media using Informed Deep Learning.

F1000 - Post-publication peer review of the biomedical literature ◽

10.3410/f.727623562.793537297 ◽

2017 ◽

Author(s):

Nigam Shah ◽

Tavpritesh Sethi

Keyword(s):

Mental Health ◽

Social Media ◽

Deep Learning ◽

Health Conditions ◽

Mental Health Conditions

Faculty Opinions recommendation of Characterisation of mental health conditions in social media using Informed Deep Learning.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.727448907.793537297 ◽

2017 ◽

Author(s):

Nigam Shah ◽

Tavpritesh Sethi

Keyword(s):

Mental Health ◽

Social Media ◽

Deep Learning ◽

Health Conditions ◽

Mental Health Conditions

Characterisation of mental health conditions in social media using Informed Deep Learning

Scientific Reports ◽

10.1038/srep45141 ◽

2017 ◽

Vol 7 (1) ◽

Cited By ~ 33

Author(s):

George Gkotsis ◽

Anika Oellrich ◽

Sumithra Velupillai ◽

Maria Liakata ◽

Tim J. P. Hubbard ◽

...

Keyword(s):

Mental Health ◽

Social Media ◽

Mental Illness ◽

Deep Learning ◽

Well Being ◽

Health Conditions ◽

Targeted Interventions ◽

Mental Health Conditions ◽

Life Years ◽

Health And Social Care

Abstract The number of people affected by mental illness is on the increase and with it the burden on health and social care use, as well as the loss of both productivity and quality-adjusted life-years. Natural language processing of electronic health records is increasingly used to study mental health conditions and risk behaviours on a large scale. However, narrative notes written by clinicians do not capture first-hand the patients’ own experiences, and only record cross-sectional, professional impressions at the point of care. Social media platforms have become a source of ‘in the moment’ daily exchange, with topics including well-being and mental health. In this study, we analysed posts from the social media platform Reddit and developed classifiers to recognise and classify posts related to mental illness according to 11 disorder themes. Using a neural network and deep learning approach, we could automatically recognise mental illness-related posts in our balanced dataset with an accuracy of 91.08% and select the correct theme with a weighted average accuracy of 71.37%. We believe that these results are a first step in developing methods to characterise large amounts of user-generated content that could support content curation and targeted interventions.

Adapting Deep Learning Methods for Mental Health Prediction on Social Media

10.18653/v1/d19-5542 ◽

2019 ◽

Author(s):

Ivan Sekulic ◽

Michael Strube

Keyword(s):

Mental Health ◽

Social Media ◽

Deep Learning ◽

Learning Methods

Forecasting mental health and emotions based on social media expressions during the COVID-19 pandemic

Information Discovery and Delivery ◽

10.1108/idd-01-2021-0003 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Antonela Tommasel ◽

Andrés Diaz-Pace ◽

Juan Manuel Rodriguez ◽

Daniela Godoy

Keyword(s):

Mental Health ◽

Social Media ◽

Time Series ◽

Mental States ◽

Health Conditions ◽

Content Type ◽

Mental Health Conditions ◽

High Prevalence ◽

Build Time ◽

Communication Policies

Purpose The purpose of this paper is to present an approach for forecasting mental health conditions and emotions of a given population during the COVID-19 pandemic in Argentina based on social media contents. Design/methodology/approach Mental health conditions and emotions are captured via markers, which link social media contents with lexicons. First, the authors build time series models that describe the evolution of markers and their correlation with crisis events. Second, the authors use the time series for forecasting markers and identifying high prevalence points for the estimated markers. Findings The authors evaluated different forecasting strategies that yielded different performance and capabilities. In the best scenario, high prevalence periods of emotions and mental health issues can be satisfactorily predicted with a neural network strategy, even at early stages of a crisis (e.g. a training period of seven days). Practical implications This work contributes to a better understanding of how psychological processes related to crises manifest in social media, and this is a valuable asset for the design, implementation and monitoring of health prevention and communication policies. Originality/value Although there have been previous efforts to predict mental states of individuals, the analysis of mental health at the collective level has received scarce attention. The authors take a step forward by proposing a forecasting approach for analyzing the mental health of a given population at a larger scale.
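The abstract describes markers that link social media content with lexicons, tracked over time to identify high prevalence points. As a rough illustration of that idea only, the sketch below computes a daily marker series (the fraction of posts containing any lexicon term) and flags days above a simple mean-plus-standard-deviation threshold. The function names, the whitespace tokenization, the toy lexicon, and the thresholding rule are all assumptions made for this example; the authors' actual time series and neural network forecasting models are not specified in the abstract and are not reproduced here.

```python
from collections import defaultdict
from statistics import mean, pstdev

def daily_marker_series(posts, lexicon):
    """Map each day to the fraction of that day's posts matching the lexicon.

    posts: iterable of (day, text) pairs; lexicon: set of lowercase terms.
    Hypothetical helper using naive whitespace tokenization.
    """
    totals, hits = defaultdict(int), defaultdict(int)
    for day, text in posts:
        tokens = set(text.lower().split())
        totals[day] += 1
        if tokens & lexicon:        # at least one lexicon term appears
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in sorted(totals)}

def high_prevalence_days(series, k=1.0):
    """Flag days whose marker value exceeds mean + k * population std."""
    values = list(series.values())
    threshold = mean(values) + k * pstdev(values)
    return [day for day, value in series.items() if value > threshold]

# Toy posts and lexicon, invented for illustration.
posts = [
    ("d1", "i feel fine"), ("d1", "so anxious today"),
    ("d2", "anxious and afraid"), ("d2", "afraid again"),
    ("d3", "nice walk"), ("d3", "good food"),
]
anxiety_lexicon = {"anxious", "afraid"}
series = daily_marker_series(posts, anxiety_lexicon)
peaks = high_prevalence_days(series)
```

A forecasting model would then be fitted to `series` to predict future marker values; the threshold step stands in for whatever criterion defines a high prevalence point in the paper.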

Analysis of Social Media Posts for Early Detection of Mental Health Conditions

Advances in Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-319-89656-4_11 ◽

2018 ◽

pp. 133-143 ◽

Cited By ~ 1

Author(s):

Antoine Briand ◽

Hayda Almeida ◽

Marie-Jean Meurs

Keyword(s):

Mental Health ◽

Social Media ◽

Early Detection ◽

Health Conditions ◽

Mental Health Conditions

Association of Social Media Use With Mental Health Conditions of Nonpatients During the COVID-19 Outbreak: Insights from a National Survey Study (Preprint)

10.2196/preprints.23696 ◽

2020 ◽

Author(s):

Bu Zhong ◽

Zhibin Jiang ◽

Wenjing Xie ◽

Xuebing Qin

Keyword(s):

Mental Health ◽

Social Media ◽

General Population ◽

Psychiatric Disorders ◽

Health Information ◽

Media Use ◽

Information Support ◽

Health Conditions ◽

Social Media Use ◽

Mental Health Conditions

BACKGROUND Considerable research has been devoted to examining the mental health conditions of patients with COVID-19 and medical staff attending to these patients during the COVID-19 pandemic. However, there are few insights concerning how the pandemic may take a toll on the mental health of the general population, and especially of nonpatients (ie, individuals who have not contracted COVID-19). OBJECTIVE This study aimed to investigate the association between social media use and mental health conditions in the general population based on a national representative sample during the peak of the COVID-19 outbreak in China. METHODS We formed a national representative sample (N=2185) comprising participants from 30 provinces across China, who were the first to experience the COVID-19 outbreak in the world. We administered a web-based survey to these participants to analyze social media use, health information support received via social media, and possible psychiatric disorders, including secondary traumatic stress (STS) and vicarious trauma (VT). RESULTS Social media use did not cause mental health issues, but it mediated the levels of traumatic emotions among nonpatients. Participants received health information support via social media, but excessive social media use led to elevated levels of stress (β=.175; P<.001), anxiety (β=.224; P<.001), depression (β=.201; P<.001), STS (β=.307; P<.001), and VT (β=.688; P<.001). Geographic location (or geolocation) and lockdown conditions also contributed to more instances of traumatic disorders. Participants living in big cities were more stressed than those living in rural areas (P=.02). Furthermore, participants from small cities or towns were more anxious (P=.01), stressed (P<.001), and depressed (P=.008) than those from rural areas. Obtaining more informational support (β=.165; P<.001) and emotional support (β=.144; P<.001) via social media increased their VT levels. Peer support received via social media increased both VT (β=.332; P<.001) and STS (β=.130; P<.001) levels. Moreover, geolocation moderated the relationships between emotional support on social media and VT (F2=3.549; P=.029) and the association between peer support and STS (F2=5.059; P=.006). Geolocation also interacted with health information support in predicting STS (F2=5.093; P=.006). CONCLUSIONS COVID-19 has taken a severe toll on the mental health of the general population, including individuals who have no history of psychiatric disorders or coronavirus infection. This study contributes to the literature by establishing the association between social media use and psychiatric disorders among the general public during the COVID-19 outbreak. The study findings suggest that the causes of such psychiatric disorders are complex and multifactorial, and social media use is a potential factor. The findings also highlight the experiences of people in China and can help global citizens and health policymakers to mitigate the effects of psychiatric disorders during this and other public health crises, which should be regarded as a key component of a global pandemic response.

Multitask Learning for Mental Health Conditions with Limited Social Media Data

10.18653/v1/e17-1015 ◽

2017 ◽

Cited By ~ 24

Author(s):

Adrian Benton ◽

Margaret Mitchell ◽

Dirk Hovy

Keyword(s):

Mental Health ◽

Social Media ◽

Multitask Learning ◽

Health Conditions ◽

Social Media Data ◽

Mental Health Conditions ◽

Media Data
