BACKGROUND
Clinical data shared on social media are an underused source of information that could give a deeper understanding of patient values, attitudes and preferences.
OBJECTIVE
We describe a novel and broadly applicable method for applying sentiment analysis and emotion detection to free text from online medical health forums, along with the factors to consider during its application.
METHODS
We mined the full discussion and user information of all posts containing search terms related to a specific medical subspecialty (oculoplastics) from MedHelp, the largest online platform for patient health forums. We applied a variety of data cleaning and processing steps to define the relevant subset of results and prepare those results for sentiment analysis. We ran sentiment and emotion analysis using the IBM Watson Natural Language Understanding service to generate sentiment and emotion scores for the posts and their associated keywords. Keywords were aggregated using natural language processing tools.
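The filtering and keyword aggregation steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the search terms, the normalization rule, and the function names (`matches_search_terms`, `normalize_keyword`, `aggregate_keywords`) are all hypothetical, and the sentiment scores are assumed to have already been returned by a scoring service.

```python
import re
from collections import defaultdict

# Hypothetical search terms for illustration; the study used 39 oculoplastics-related terms.
SEARCH_TERMS = ["ptosis", "blepharoplasty", "eyelid surgery"]


def matches_search_terms(text, terms=SEARCH_TERMS):
    """Return True if any term appears in the text as a whole word or phrase."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
        for term in terms
    )


def normalize_keyword(keyword):
    """Crude normalization (an assumption): lowercase, drop punctuation, collapse spaces."""
    cleaned = re.sub(r"[^a-z0-9\s-]", "", keyword.lower())
    return re.sub(r"\s+", " ", cleaned).strip()


def aggregate_keywords(scored_keywords):
    """Group (keyword, sentiment) pairs by normalized form.

    Returns a dict mapping each normalized keyword to its occurrence
    count and the mean of its sentiment scores.
    """
    buckets = defaultdict(list)
    for keyword, sentiment in scored_keywords:
        key = normalize_keyword(keyword)
        if key:  # skip keywords that normalize to an empty string
            buckets[key].append(sentiment)
    return {
        key: {"count": len(scores), "mean_sentiment": sum(scores) / len(scores)}
        for key, scores in buckets.items()
    }
```

Whole-word matching is used here so that a short term does not match inside a longer unrelated word; distinguishing different linguistic usages of the same term would need further, corpus-specific rules.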
RESULTS
A total of 39 oculoplastics-related search terms yielded 46,381 eligible posts within 14,329 threads, written by 18,319 users (117 doctors; 18,202 patients), and 201,611 associated keywords. Keywords that occurred ≥500 times in the corpus were used to identify the most prominent topics, including specific symptoms, medications and complications. The sentiment and emotion scores of these keywords and eligible posts were further analyzed to provide concrete examples of how the methodology can improve understanding of patients’ attitudes.
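The frequency cutoff used to select prominent topics can be illustrated with a short sketch. The function name `prominent_keywords` is hypothetical, and the input is assumed to be a flat list of already-aggregated keyword strings, one entry per occurrence; only the ≥500 threshold comes from the text above.

```python
from collections import Counter


def prominent_keywords(keywords, min_count=500):
    """Return keywords occurring at least min_count times, with their counts."""
    counts = Counter(keywords)
    return {keyword: n for keyword, n in counts.items() if n >= min_count}
```

The threshold is exposed as a parameter because a suitable cutoff probably depends on corpus size; a smaller forum sample would likely need a lower value.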
CONCLUSIONS
This comprehensive report enables physicians and researchers to efficiently mine social media and perform sentiment analysis on the results, to better understand patients’ perspectives and promote patient-centric care. Important factors to consider during application include evaluating the scope of the search, selecting search terms and understanding their different linguistic usages, and establishing robust selection, filtering and processing criteria for posts and keywords, tailored to the results.