BACKGROUND
Prior qualitative and data science studies that use Reddit for tobacco research are limited by the lack of demographic information about users. Social media investigations also tend to rely on either manual qualitative coding or machine learning classification in isolation.
OBJECTIVE
This study combines machine learning and manual qualitative coding to add age context to social media analysis. Predicting a Redditor’s age from publicly available data allows the most popular posts to be qualitatively coded and compared thematically by age group.
METHODS
The two methods were combined to 1) classify Reddit users into two predicted age groups (13-20 and 21-54 years) and 2) qualitatively code electronic nicotine delivery system (ENDS)-related Reddit posts within each group. We identified Reddit posts on three topics: Vaping in General, Tobacco 21 Minimum Age Laws, and Flavor Restriction Policies. An age-prediction algorithm assigned each poster to one of the two age groups. The 25 posts with the highest karma score (number of upvotes minus number of downvotes) for each query and each predicted age group were qualitatively coded.
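The post-selection step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the `Post` record, its field names, and the `top_posts` helper are hypothetical, and the age group is assumed to have already been assigned by a separate prediction step.

```python
from collections import defaultdict
from typing import NamedTuple


class Post(NamedTuple):
    """Hypothetical record for one Reddit post (field names are assumptions)."""
    post_id: str
    query: str      # e.g. "Vaping in General"
    age_group: str  # predicted upstream: "13-20" or "21-54"
    upvotes: int
    downvotes: int


def karma(post: Post) -> int:
    # Karma score as defined in the text: upvotes minus downvotes.
    return post.upvotes - post.downvotes


def top_posts(posts, n=25):
    """Return the n highest-karma posts for each (query, age group) pair."""
    groups = defaultdict(list)
    for post in posts:
        groups[(post.query, post.age_group)].append(post)
    # Sort each group by karma, highest first, and keep the first n posts.
    return {
        key: sorted(members, key=karma, reverse=True)[:n]
        for key, members in groups.items()
    }
```

With three queries and two predicted age groups, this selection yields at most 150 posts for qualitative coding.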
RESULTS
Nine topics emerged. The top three were “Flavor Restriction Policies”, “Tobacco 21 Policies”, and “Use”; the first two were also search queries, so their prominence as coding categories was expected. Opposition to flavor restriction policies was a prominent subcategory in both groups but was more common in the 21-54 group. The 13-20 group was more likely to discuss opposition to minimum age laws and access to flavored ENDS products. The 21-54 group more commonly mentioned general vaping use behavior.
CONCLUSIONS
Users predicted to be in the 13-20 age group posted about different ENDS-related topics on Reddit than users predicted to be in the 21-54 age group. Future studies could use these complementary methods with social media data to gain insights from target audiences.