BACKGROUND
Drug abuse is a severe challenge for the United States. Social media platforms such as Reddit have become some of the most efficient channels for drug users to share experiences and communicate with each other. By analyzing users’ perceptions and behavior patterns concerning the causes and symptoms of drug use on social media, public health researchers and agencies can gain innovative real-time situational awareness and surveillance capabilities to tackle the drug abuse crisis.
OBJECTIVE
This paper aims to develop a social media-based approach for analyzing the causes and symptoms of drug use that users report across different age groups, and thereby to gain deeper insight into users’ behavior patterns.
METHODS
We collected 163,610 posts from the Reddit /r/drugs community, dated from February 2008 to December 2017. First, we extracted topics for drugs, causes, and symptoms using word vector learning (word2vec). Then, we designed a method to automatically extract age information from posts. Finally, we established the relationships between age and drugs, causes, and symptoms.
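As a rough illustration of the age-extraction step, the following Python sketch pulls a self-reported age out of a post and assigns it to one of the age groups used in the results. The abstract does not describe the actual extraction rules, so the regular expressions, the plausible-age bounds, the function names, and the open-ended `age>=30` group are all assumptions made for this example.

```python
import re

# Hypothetical patterns for self-reported ages; the study's real rules are not given.
AGE_PATTERNS = [
    # "I'm 19", "I am 19", but not "I'm 3 days clean" or "I'm 20 mg in"
    re.compile(
        r"\bI(?:['’]m| am)\s+(\d{1,2})\b"
        r"(?!\s*(?:mg|g|ml|days?|weeks?|months?|hours?|minutes?)\b)",
        re.IGNORECASE,
    ),
    # "22 years old", "22-year-old", "22 y/o", "22yo"
    re.compile(r"\b(\d{1,2})[\s-]*(?:years?[\s-]*old|y/?o)\b", re.IGNORECASE),
    # "(24M)", "(19 F)"
    re.compile(r"\((\d{1,2})\s*[MF]\)", re.IGNORECASE),
]

# Assumed bounds used to discard implausible matches.
MIN_AGE, MAX_AGE = 10, 80


def extract_age(text):
    """Return the first plausible self-reported age in text, or None."""
    for pattern in AGE_PATTERNS:
        match = pattern.search(text)
        if match:
            age = int(match.group(1))
            if MIN_AGE <= age <= MAX_AGE:
                return age
    return None


def age_group(age):
    """Map an age to the group labels used in the results."""
    if age < 15:
        return "age<15"
    if age < 20:
        return "15<=age<20"
    if age < 25:
        return "20<=age<25"
    if age < 30:
        return "25<=age<30"
    return "age>=30"  # assumed label for all remaining ages
```

A post such as "I'm 19 and tried LSD last weekend" would be assigned to the 15<=age<20 group, while posts without a recognizable age statement return `None` and would be excluded from the age analysis.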
RESULTS
We found that: (1) Drug topics fell into 6 categories: alcoholic, tobaccos, prescriptions, hallucinogens, ecstasies, and other highly addictive drugs. Tobaccos (n=11,947), hallucinogens (n=9244), and other highly addictive drugs (n=5733) were the most discussed drug categories. (2) Of the posts that contained age information, 9646 (65%) were written by users between 15 and 25 years old. Age was related to drug addiction: people tended to move from a primary drug (e.g., marijuana) to other highly addictive drugs (e.g., heroin) as they aged. The age groups age<15, 15<=age<20, and 20<=age<25 had very similar drug topic patterns: the Pearson correlation coefficient (PCC) was .994 (P<.001) between age<15 and 15<=age<20, and .985 (P<.001) between age<15 and 20<=age<25. A similar pattern was observed for the age groups 20<=age<25 and 25<=age<30, where the PCC was .989 (P<.001). (3) The reasons users gave for taking different types of drugs fell into 3 categories: curiosity, anxiety, and pain. The most frequently mentioned cause was curiosity about a novel experience (n=1215, 1215/1848), which may be due to the characteristics of young users. For the curiosity cause, the ratios of posts discussing tobaccos and hallucinogens were 0.5045 (613/1215) and 0.3975 (483/1215), respectively. (4) The symptoms caused by taking drugs fell into 6 topics: hallucination, comfort, lose_control, affect_brain, anxious, and pain. Anxiety was the most frequently mentioned symptom (n=574, 574/1096). The number of posts reporting anxiety as a symptom caused by drugs (n=574) was slightly larger than the number of posts discussing the use of drugs to relieve anxiety (n=558).
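The comparison of drug topic patterns between age groups can be sketched as a Pearson correlation between two vectors of topic proportions, one entry per drug category. The Python example below is a minimal illustration only: the function names are hypothetical, and the per-category counts are made-up placeholders, not the study's data. The last two lines recompute the curiosity ratios from the post counts reported above.

```python
import math


def topic_proportions(counts):
    """Convert per-category post counts into proportions that sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must not all be zero")
    return [c / total for c in counts]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


# Placeholder counts for the 6 drug categories (NOT taken from the study).
under_15_counts = [10, 50, 5, 20, 10, 5]
age_15_19_counts = [20, 100, 10, 40, 20, 10]

# Correlate the two groups' topic distributions.
pcc = pearson(
    topic_proportions(under_15_counts),
    topic_proportions(age_15_19_counts),
)

# Ratios for the curiosity cause, recomputed from the reported post counts.
tobacco_ratio = round(613 / 1215, 4)
hallucinogen_ratio = round(483 / 1215, 4)
```

Normalizing counts to proportions before correlating keeps a group with many posts from looking different merely because of its size, although Pearson correlation is itself unchanged by such rescaling; the normalization mainly makes the vectors easier to inspect side by side.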
CONCLUSIONS
This is the first study aimed at gaining deep insight into users’ behavior patterns; its findings can help public health researchers and agencies provide personalized health regulatory services for people of different ages.