Studies of depression and anxiety using Reddit as a data source: Scoping review (Preprint)

2021 ◽  
Author(s):  
Nick Boettcher

BACKGROUND The study of depression and anxiety using publicly available social media data is a research activity that has grown considerably over the last decade. The discussion platform Reddit has become a popular social media data source in this nascent area of study, in part because of the unique ways in which the platform facilitates research. To date, no work has been done to synthesize existing studies of depression and anxiety using Reddit. OBJECTIVE The objective of this review is to understand the scope and nature of research using Reddit as a primary data source for studying depression and anxiety. METHODS A scoping review was conducted using the Arksey and O’Malley framework. Academic databases searched include MEDLINE/PubMed, EMBASE, CINAHL, PsycINFO, PsycARTICLES, Scopus, ScienceDirect, IEEE Xplore, and the ACM database. Inclusion criteria were developed using the Participants/Concept/Context framework outlined by the Joanna Briggs Institute Scoping Review Methodology Group. Eligible studies featured a methodological focus on analyzing depression and/or anxiety using naturalistic written expressions from Reddit users as the primary data source. RESULTS A total of 54 studies were included for review. Tables and corresponding analysis delineate key methodological features, including a comparatively larger focus on depression than on anxiety, an even split of original and premade datasets, a favored analytic focus on classifying the mental health states of Reddit users, and practical implications often recommending new methods of professionally driven mental health monitoring and outreach for Reddit users. CONCLUSIONS Studies of depression and anxiety using Reddit data are currently driven by a prevailing methodology that favors a technical, solution-based orientation. Researchers interested in advancing this research area will benefit from further consideration of conceptual issues surrounding interpretation of Reddit data with the medical model of mental health. Further efforts are also needed to locate accountability and autonomy within practice implications suggesting new forms of engagement with Reddit users.

2020 ◽  
Vol 3 (1) ◽  
pp. 433-458 ◽  
Author(s):  
Rion Brattig Correia ◽  
Ian B. Wood ◽  
Johan Bollen ◽  
Luis M. Rocha

Social media data have been increasingly used to study biomedical and health-related phenomena. From cohort-level discussions of a condition to population-level analyses of sentiment, social media have provided scientists with unprecedented amounts of data to study human behavior associated with a variety of health conditions and medical treatments. Here we review recent work in mining social media for biomedical, epidemiological, and social phenomena information relevant to the multilevel complexity of human health. We pay particular attention to topics where social media data analysis has shown the most progress, including pharmacovigilance and sentiment analysis, especially for mental health. We also discuss a variety of innovative uses of social media data for health-related applications as well as important limitations of social media data access and use.


2021 ◽  
Author(s):  
Vishal Dey ◽  
Peter Krasniak ◽  
Minh Nguyen ◽  
Clara Lee ◽  
Xia Ning

BACKGROUND A new illness can come to public attention through social media before it is medically defined, formally documented, or systematically studied. One example is a condition known as breast implant illness (BII), which has been extensively discussed on social media, although it is vaguely defined in the medical literature. OBJECTIVE The objective of this study is to construct a data analysis pipeline to understand emerging illnesses using social media data and to apply the pipeline to understand the key attributes of BII. METHODS We constructed a pipeline of social media data analysis using natural language processing and topic modeling. Mentions related to signs, symptoms, diseases, disorders, and medical procedures were extracted from social media data using the clinical Text Analysis and Knowledge Extraction System. We mapped the mentions to standard medical concepts and then summarized these mapped concepts as topics using latent Dirichlet allocation. Finally, we applied this pipeline to understand BII from several BII-dedicated social media sites. RESULTS Our pipeline identified topics related to toxicity, cancer, and mental health issues that were highly associated with BII. Our pipeline also showed that cancers, autoimmune disorders, and mental health problems were emerging concerns associated with breast implants, based on social media discussions. Furthermore, the pipeline identified mentions such as rupture, infection, pain, and fatigue as common self-reported issues among the public, as well as concerns about toxicity from silicone implants. CONCLUSIONS Our study could inspire future studies on the suggested symptoms and factors of BII. Our study provides the first analysis and derived knowledge of BII from social media using natural language processing techniques and demonstrates the potential of using social media information to better understand similar emerging illnesses.
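
The "map mentions to standard concepts, then summarize" step described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the study uses the clinical Text Analysis and Knowledge Extraction System and latent Dirichlet allocation, whereas the sketch below swaps in a plain dictionary lookup and per-post counts. The `CONCEPT_MAP` entries, function names, and example posts are all hypothetical.

```python
from collections import Counter

# Hypothetical lookup from surface mentions to concept labels.
# A real pipeline would rely on a clinical NLP tool and a standard vocabulary.
CONCEPT_MAP = {
    "tired": "fatigue",
    "exhausted": "fatigue",
    "fatigue": "fatigue",
    "ache": "pain",
    "pain": "pain",
    "rupture": "implant rupture",
    "ruptured": "implant rupture",
}


def extract_concepts(post: str) -> set[str]:
    """Return the set of mapped concepts mentioned in one post."""
    tokens = (tok.strip(".,!?;:").lower() for tok in post.split())
    return {CONCEPT_MAP[tok] for tok in tokens if tok in CONCEPT_MAP}


def concept_frequencies(posts: list[str]) -> Counter:
    """Count how many posts mention each concept (document frequency)."""
    counts = Counter()
    for post in posts:
        counts.update(extract_concepts(post))
    return counts


# Invented example posts, for illustration only.
posts = [
    "I feel exhausted and the pain will not stop.",
    "My implant ruptured last year.",
    "Always tired, constant fatigue.",
]
frequencies = concept_frequencies(posts)
```

Counting each concept at most once per post keeps a single long post from dominating the summary; a topic model would replace this counting step in a fuller implementation.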


Author(s):  
Mohamad Hasan

This paper presents a model to collect, save, geocode, and analyze social media data. The model is used to collect and process social media data concerning the ISIS terrorist group (the Islamic State in Iraq and Syria) and to map the areas in Syria most affected by ISIS according to those data. The mapping process is understood here as the automated compilation of a density map of the geocoded tweets. Data mined from social media (e.g., Twitter and Facebook) are recognized as a dynamic and easily accessible resource that can be used as a data source in spatial analysis and geographical information systems. Social media data can be represented as topic data and geocoded data based on the text mined from social media and processed using natural language processing (NLP) methods. NLP is a subdomain of artificial intelligence concerned with programming computers to analyze natural human language and texts. NLP allows the identification of the words that the developed geocoding algorithm uses as its initial data. In this study, the needed words were identified using two corpora. The first corpus contained the names of populated places in Syria. The second corpus was compiled through a statistical analysis of the number of tweets, picking the words that have a location meaning (e.g., schools, temples). After identifying the words, the algorithm used the Google Maps geocoding API to obtain the coordinates for posts.
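
The place-name matching and density counting described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: a small local gazetteer stands in for both the place-name corpus and the Google Maps geocoding API, the coordinates are approximate illustrative values, and `GAZETTEER`, `geocode`, and `density` are hypothetical names.

```python
from collections import Counter

# Tiny illustrative gazetteer: place name -> approximate (lat, lon).
# A real system would query a geocoding service instead.
GAZETTEER = {
    "aleppo": (36.20, 37.16),
    "raqqa": (35.95, 39.01),
    "damascus": (33.51, 36.29),
}


def geocode(text):
    """Return (place, (lat, lon)) for every gazetteer name found in the text."""
    tokens = {tok.strip(".,!?;:#").lower() for tok in text.split()}
    matches = tokens & set(GAZETTEER)
    return [(name, GAZETTEER[name]) for name in sorted(matches)]


def density(texts):
    """Count geocoded mentions per coordinate pair across all texts."""
    counts = Counter()
    for text in texts:
        for _name, coords in geocode(text):
            counts[coords] += 1
    return counts


# Invented example posts, for illustration only.
tweets = [
    "Fighting reported near Raqqa today.",
    "Aid convoy reaches Aleppo",
    "Raqqa schools closed",
]
counts = density(tweets)
```

The resulting counts per coordinate pair are the raw input a density map would be drawn from; multi-word place names and transliteration variants would need extra handling.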


BMJ Open ◽  
2018 ◽  
Vol 8 (12) ◽  
pp. e022931 ◽  
Author(s):  
Joanna Taylor ◽  
Claudia Pagliari

Introduction The rising popularity of social media, since their inception around 20 years ago, has been echoed in the growth of health-related research using data derived from them. This has created a demand for literature reviews to synthesise this emerging evidence base and inform future activities. Existing reviews tend to be narrow in scope, with limited consideration of the different types of data, analytical methods and ethical issues involved. There has also been a tendency for research to be siloed within different academic communities (eg, computer science, public health), hindering knowledge translation. To address these limitations, we will undertake a comprehensive scoping review, to systematically capture the broad corpus of published, health-related research based on social media data. Here, we present the review protocol and the pilot analyses used to inform it. Methods A version of Arksey and O’Malley’s five-stage scoping review framework will be followed: (1) identifying the research question; (2) identifying the relevant literature; (3) selecting the studies; (4) charting the data and (5) collating, summarising and reporting the results. To inform the search strategy, we developed an inclusive list of keyword combinations related to social media, health and relevant methodologies. The frequency and variability of terms were charted over time and cross-referenced with significant events, such as the advent of Twitter. Five leading health, informatics, business and cross-disciplinary databases will be searched: PubMed, Scopus, Association for Computing Machinery, Institute of Electrical and Electronics Engineers and Applied Social Sciences Index and Abstracts, alongside the Google search engine. There will be no restriction by date. Ethics and dissemination The review focuses on published research in the public domain; therefore, no ethics approval is required. The completed review will be submitted for publication to a peer-reviewed, interdisciplinary open access journal, and conferences on public health and digital research.
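
Building keyword combinations across the three concept groups named in the protocol (social media, health, methodology) can be sketched as below. The term lists are hypothetical placeholders rather than the protocol's actual keywords, and the `AND`-joined quoted-phrase format is only one plausible query syntax.

```python
from itertools import product

# Hypothetical term lists, one per concept group.
social_terms = ["social media", "Twitter"]
health_terms = ["health", "epidemiology"]
method_terms = ["text mining", "sentiment analysis"]


def build_queries(*term_groups):
    """Return one AND-joined query string per combination of terms."""
    return [
        " AND ".join(f'"{term}"' for term in combo)
        for combo in product(*term_groups)
    ]


queries = build_queries(social_terms, health_terms, method_terms)
```

With two terms per group this yields eight queries; the count grows multiplicatively with each list, which is why charting term frequency before fixing the lists is useful.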


Author(s):  
F. O. Ostermann ◽  
H. Huang ◽  
G. Andrienko ◽  
N. Andrienko ◽  
C. Capineri ◽  
...  

The increasing availability of Geo-Social Media (e.g. Facebook, Foursquare and Flickr) has led to the accumulation of large volumes of social media data. These data, especially geotagged ones, contain information about the perception of and experiences in various environments. Harnessing these data can provide a better understanding of the semantics of places. We are interested in the similarities and differences between different Geo-Social Media in the description of places. This extended abstract presents the results of a first step towards a more in-depth study of the semantic similarity of places. In particular, we took places extracted through spatio-temporal clustering from one data source (Twitter) and examined whether their structure is reflected semantically in another data set (Flickr). Based on that, we analyse how the semantic similarity between places varies over space and scale, and how Tobler's first law of geography holds with regard to scale and places.
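
One simple way to quantify the semantic similarity of two places is to compare the tags attached to each. The abstract does not state which measure the authors used, so the cosine similarity over tag counts below is an assumption, and the tag lists are invented for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of tags (0.0 if either is empty)."""
    shared = set(a) & set(b)
    dot = sum(a[tag] * b[tag] for tag in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical tag bags for three places.
place_a = Counter("park river bridge park".split())
place_b = Counter("park river museum".split())
place_c = Counter("airport terminal".split())

sim_ab = cosine_similarity(place_a, place_b)
sim_ac = cosine_similarity(place_a, place_c)
```

Pairing such similarity scores with the geographic distance between places is one way to examine whether nearby places are also semantically closer, which is the question Tobler's first law raises.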


2021 ◽  
Author(s):  
Koustuv Saha ◽  
Asra Yousuf ◽  
Ryan L. Boyd ◽  
James W. Pennebaker ◽  
Munmun De Choudhury

Abstract The mental health of college students is a growing concern, and the mental health needs of college students are difficult to gauge in real time and at scale. While social media has shown potential as a viable “passive sensor” of mental health, the construct validity and in-practice reliability of such computational assessments remain largely unexplored. Towards this goal, we study how assessments of the mental health of college students based on social media data correspond with ground-truth data of on-campus mental health consultations. For a large U.S. public university, we obtained ground-truth data of on-campus mental health consultations between 2011 and 2016, and collected 66,000 posts from the university’s Reddit community. We adopted machine learning and natural language methodologies to measure symptomatic mental health expressions of depression, anxiety, stress, suicidal ideation, and psychosis in the social media data. Seasonal auto-regressive integrated moving average (SARIMA) models for forecasting on-campus mental health consultations showed that incorporating social media data led to predictions with r=0.86 and SMAPE=13.30, outperforming models without social media data by 41%. Our language analyses revealed that social media discussions during months of high mental health consultations consisted of discussions on academics and career, whereas months of low mental health consultations saliently show expressions of positive affect, collective identity, and socialization. This study reveals that social media data can improve our understanding of college students’ mental health, particularly their mental health treatment needs.
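
The two forecast-quality figures reported above, Pearson's r and the symmetric mean absolute percentage error (SMAPE), can be computed as in the sketch below. SMAPE has several variants; this one divides each absolute error by the mean of the absolute actual and forecast values and reports a percentage, which is an assumption because the abstract does not say which variant was used. The function names are hypothetical and the test series are invented.

```python
import math


def smape(actual, forecast):
    """Symmetric mean absolute percentage error, as a percentage."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2
        # Treat a 0/0 term as zero error instead of dividing by zero.
        terms.append(0.0 if denom == 0 else abs(f - a) / denom)
    return 100 * sum(terms) / len(terms)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented monthly consultation counts and forecasts.
actual = [100, 200]
forecast = [110, 180]
error = smape(actual, forecast)
```

A lower SMAPE and a higher r both indicate a better forecast, so the two are usually reported together, as in the abstract.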


10.2196/26119 ◽  
2021 ◽  
Vol 23 (8) ◽  
pp. e26119
Author(s):  
Guanghui Fu ◽  
Changwei Song ◽  
Jianqiang Li ◽  
Yue Ma ◽  
Pan Chen ◽  
...  

Background Web-based social media provides common people with a platform to express their emotions conveniently and anonymously. There have been nearly 2 million messages in a particular Chinese social media data source, and several thousands more are generated each day. Therefore, it has become impossible to analyze these messages manually. However, these messages have been identified as an important data source for the prevention of suicide related to depression disorder. Objective We proposed in this paper a distant supervision approach to developing a system that can automatically identify textual comments that are indicative of a high suicide risk. Methods To avoid expensive manual data annotations, we used a knowledge graph method to produce approximate annotations for distant supervision, which provided a basis for a deep learning architecture that was built and refined by interactions with psychology experts. There were three annotation levels, as follows: free annotations (zero cost), easy annotations (by psychology students), and hard annotations (by psychology experts). Results Our system was evaluated accordingly and showed that its performance at each level was promising. By combining our system with several important psychology features from user blogs, we obtained a precision of 80.75%, a recall of 75.41%, and an F1 score of 77.98% for the hardest test data. Conclusions In this paper, we proposed a distant supervision approach to develop an automatic system that can classify high and low suicide risk based on social media comments. The model can therefore provide volunteers with early warnings to prevent social media users from committing suicide.
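
For reference, the F1 score is the harmonic mean of precision and recall. The sketch below is a generic illustration rather than the authors' evaluation code; the confusion counts are hypothetical, and only the precision and recall values are taken from the abstract.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts, for illustration only.
p, r = precision_recall(tp=8, fp=2, fn=4)
f1_example = f1_score(p, r)

# Precision and recall as reported in the abstract.
f1_reported_inputs = f1_score(0.8075, 0.7541)
```

Applied to the reported precision and recall, the formula gives roughly 0.78, in line with the reported F1 score up to rounding of the inputs.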


2020 ◽  
Author(s):  
Koustuv Saha ◽  
John Torous ◽  
Eric D. Caine ◽  
Munmun De Choudhury

Abstract Background The novel coronavirus disease 2019 (COVID-19) pandemic has caused several disruptions in personal and collective lives worldwide. The uncertainties surrounding the pandemic have also led to multi-faceted mental health concerns, which can be exacerbated by precautionary measures such as social distancing and self-quarantining, as well as societal impacts such as economic downturn and job loss. Although this has been described as a “mental health tsunami,” the psychological effects of the COVID-19 crisis remain unexplored at scale. Consequently, public health stakeholders are currently limited in identifying ways to provide timely and tailored support during these circumstances. Objective Our work aims to provide insights regarding people’s psychosocial concerns during the COVID-19 pandemic by leveraging social media data. We aim to study the temporal and linguistic changes in symptomatic mental health and support expressions in the pandemic context. Methods We obtain ∼60M Twitter streaming posts originating from the U.S. from 24 March to 24 May 2020, and compare these with ∼40M posts from a comparable period in 2019 to attribute the effect of COVID-19 on people’s social media self-disclosure. Using these datasets, we study people’s self-disclosure on social media in terms of symptomatic mental health concerns and expressions of support. We employ transfer learning classifiers that identify the social media language indicative of mental health outcomes (anxiety, depression, stress, and suicidal ideation) and support (emotional and informational support). We then examine the changes in psychosocial expressions over time and language, comparing the 2020 and 2019 datasets. Results We find that all of the examined psychosocial expressions have significantly increased during the COVID-19 crisis: mental health symptomatic expressions have increased by ∼14%, and support expressions have increased by ∼5%, both thematically related to COVID-19. We also observe a steady decline and eventual plateauing in these expressions during the COVID-19 pandemic, which may have been due to habituation or to supportive policy measures enacted during this period. Our language analyses highlight that people express concerns that are very specific to and contextually related to the COVID-19 crisis. Conclusions We studied the psychosocial effects of the COVID-19 crisis by using social media data from 2020, finding that people’s mental health symptomatic and support expressions significantly increased during the COVID-19 period as compared to similar data from 2019. However, this effect gradually lessened over time, suggesting that people adapted to the circumstances and their “new normal”. Our linguistic analyses revealed that people expressed mental health concerns regarding personal and professional challenges, healthcare and precautionary measures, and pandemic-related awareness. This work shows the potential to provide insights to mental healthcare stakeholders and policymakers in planning and implementing measures to mitigate mental health risks amidst the health crisis.


2021 ◽  
Author(s):  
Su Golder ◽  
Robin Stevens ◽  
Karen O'Connor ◽  
Richard James ◽  
Graciela Gonzalez-Hernandez

BACKGROUND A growing amount of health research uses social media data. Those critical of social media research often argue that it may be unrepresentative of the population, but the suitability of social media data in digital epidemiology is more nuanced. Identifying the demographics of social media users can help establish representativeness. OBJECTIVE We sought to identify the different approaches or combinations of approaches used to extract race or ethnicity from social media and to report on the challenges of using these methods. METHODS We present a scoping review to identify the methods used to extract race or ethnicity from Twitter datasets. We searched 17 electronic databases and carried out reference checking and handsearching in order to identify relevant articles. Sifting of each record was undertaken independently by at least two researchers, with any disagreement discussed. The included studies could be categorized by the methods the authors applied to extract race or ethnicity. RESULTS From 1249 records we identified 67 that met our inclusion criteria. The majority focused on US-based users and English-language tweets. A range of data types were used, including Twitter profile pictures or information from bios (such as names or self-declarations), or location and/or content in the tweets themselves. A range of methodologies were used, including manual inference, linkage to census data, commercial software, language/dialect recognition and machine learning. Not all studies evaluated their methods. Those that did found accuracy to vary from 45% to 93%, with significantly lower accuracy in identifying non-white race categories. The inference of race/ethnicity raises important ethical questions, which can be exacerbated by the data and methods used. The comparative accuracy of different methods is also largely unknown. CONCLUSIONS There is no standard accepted approach or current guidelines for extracting or inferring the race or ethnicity of Twitter users. Social media researchers must interpret race or ethnicity carefully and not over-promise what can be achieved, as even manual screening is a subjective, imperfect method. Future research should establish the accuracy of methods to inform evidence-based best practice guidelines for social media researchers, and be guided by concerns of equity and social justice.


Aksara ◽  
2021 ◽  
Vol 32 (2) ◽  
pp. 323-338
Author(s):  
Hari Kusmanto ◽  
Nadia Puji Ayu ◽  
Harun Joko Prayitno ◽  
Laili Etika Rahmawati ◽  
Dini Restiyanti Pratiwi ◽  
...  

Abstract This study aims to describe the forms of politeness in communication between students and lecturers on the social media platform WhatsApp. The study is qualitative. The data are polite sentences in academic discourse on social media, and the data source is utterances of academic discourse on social media. Data were collected using the documentation and observation (simak) methods, followed by a note-taking technique. Data were analyzed using the intralingual and pragmatic matching (padan) methods, reinforced by Brown and Levinson’s politeness analysis from a humanist perspective. The results show that positive politeness acts include: (1) expressing thanks as a sign of respect for the interlocutor, 48%; (2) asking questions as a form of attention to the interlocutor, 8%; (3) providing information to the interlocutor as a form of care, 18%; (4) showing optimism so that the interlocutor is motivated, 4%; (5) giving gifts to the interlocutor in the form of support, 4%; (6) greeting the interlocutor as a way of wishing them well, 8%; and (7) using identity markers to build solidarity between the speaker and the interlocutor, 10%. This shows that students have a high level of respect for lecturers, expressed through positively toned communication. Acts of expressing thanks, providing information needed by the interlocutor, showing self-confidence, and greeting are forms of communication with a humanist perspective, that is, one that upholds human values. This research is useful for building learning communication oriented toward linguistic politeness that dignifies human values in learning. Keywords: positive politeness, academic, social media, humanist. Published as: Realisasi Tindak Kesantunan Positif dalam Wacana Akademik di Media Sosial Berperspektif Humanitas (Hari Kusmanto, Nadia P. Ayu, Harun J. Prayitno, Laili E. Rahmawati, Dini R. Pratiwi, and Tri Santoso), Vol. 32, No. 2, December 2020, pp. 323-338, ISSN 0854-3283 (Print), ISSN 2580-0353 (Online).

