A Rule-Based Sentiment Classification Framework for Health Reviews on Mobile Social Media

BACKGROUND: The COVID-19 pandemic has created a pressing need to integrate information from disparate sources in order to assist decision makers. Social media is important in this respect; however, natural language processing methods are needed to make sense of the textual information it provides and to automate the processing of large amounts of data. Social media posts are often noisy, yet they may provide valuable insights regarding the severity and prevalence of the disease in the population. In particular, machine learning techniques for triage and diagnosis could allow for a better understanding of what social media may offer in this respect.

OBJECTIVE: This study aims to develop an end-to-end natural language processing pipeline for triage and diagnosis of COVID-19 from patient-authored social media posts, in order to provide researchers and other interested parties with additional information on the symptoms, severity, and prevalence of the disease.

METHODS: The text processing pipeline first extracts COVID-19 symptoms and related concepts such as severity, duration, negations, and body parts from patients' posts using conditional random fields. In the next step, an unsupervised rule-based algorithm establishes relations between the extracted concepts. The concepts and relations are then used to construct two different vector representations of each post. These vectors are used separately to train support vector machine models that triage patients into three categories and diagnose them for COVID-19.

RESULTS: We report macro- and micro-averaged F1 scores in the range of 71-96% and 61-87%, respectively, for the triage and diagnosis of COVID-19 when the models are trained on human-labelled data. Our experimental results indicate that similar performance can be achieved when the models are trained using predicted labels from the concept extraction and rule-based classifiers, thus yielding end-to-end machine learning. We also highlight important features uncovered by our diagnostic machine learning models and compare them with the most frequent symptoms revealed in another COVID-19 dataset. In particular, we found that the most important features are not always the most frequent ones.

CONCLUSIONS: Our preliminary results show that it is possible to automatically triage and diagnose patients for COVID-19 from natural language narratives using a machine learning pipeline, in order to provide additional information on the severity and prevalence of the disease through the eyes of social media.
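The abstract reports both macro- and micro-averaged F1 scores. As a minimal sketch of how the two averages differ (not the authors' evaluation code), the function below computes both from lists of true and predicted labels. The function name `f1_scores` and the three triage label names in the demo are hypothetical; the abstract states only that there are three categories.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1, per_class_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_n, fp_n, fn_n):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp_n + fp_n + fn_n
        return 2 * tp_n / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    # Macro: unweighted mean of per-class F1, so rare classes count equally.
    macro = sum(per_class.values()) / len(labels)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro, per_class


if __name__ == "__main__":
    # Hypothetical triage labels, for illustration only.
    truth = ["stay_home", "stay_home", "see_gp", "emergency", "see_gp"]
    preds = ["stay_home", "see_gp", "see_gp", "emergency", "see_gp"]
    macro, micro, per_class = f1_scores(truth, preds)
    print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

Because macro averaging weights every class equally while micro averaging weights every post equally, the two can diverge noticeably on imbalanced data, which may be why both are reported.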

A Joint Segmentation and Classification Framework for Sentence Level Sentiment Classification

IEEE/ACM Transactions on Audio Speech and Language Processing, 2015, Vol 23 (11), pp. 1750-1761
DOI: 10.1109/taslp.2015.2449071
Cited By ~ 30

Author(s): Duyu Tang, Bing Qin, Furu Wei, Li Dong, Ting Liu, ...

Keyword(s): Sentiment Classification, Classification Framework, Sentence Level

Sentiment Classification Based on Integration of Rule-Based Method and Machine Learning with Image Sentiment Recognition

Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery - Advances in Intelligent Systems and Computing, 2021, pp. 286-293
DOI: 10.1007/978-3-030-70665-4_33

Author(s): Xinyuan Chen, Shengyi Xie, Qingqiang Chen

Keyword(s): Machine Learning, Sentiment Classification, Rule Based

A Robust Rule-Based Ensemble Framework Using Mean-Shift Segmentation for Hyperspectral Image Classification

Remote Sensing, 2019, Vol 11 (17), pp. 2057
DOI: 10.3390/rs11172057
Cited By ~ 3

Author(s): Majid Shadman Roodposhti, Arko Lucieer, Asim Anees, Brett Bryan

Keyword(s): Image Classification, Hyperspectral Image, Mean Shift, Training Sample, Uncertainty Assessment, Hyperspectral Image Classification, Rule Based, Classification Framework, Rule Sets, Ensemble Algorithms

This paper assesses the performance of DoTRules—a dictionary of trusted rules—as a supervised rule-based ensemble framework based on the mean-shift segmentation for hyperspectral image classification. The proposed ensemble framework consists of multiple rule sets with rules constructed based on different class frequencies and sequences of occurrences. Shannon entropy was derived for assessing the uncertainty of every rule and the subsequent filtering of unreliable rules. DoTRules is not only a transparent approach for image classification but also a tool to map rule uncertainty, where rule uncertainty assessment can be applied as an estimate of classification accuracy prior to image classification. In this research, the proposed image classification framework is implemented using three world reference hyperspectral image datasets. We found that the overall accuracy of classification using the proposed ensemble framework was superior to state-of-the-art ensemble algorithms, as well as two non-ensemble algorithms, at multiple training sample sizes. We believe DoTRules can be applied more generally to the classification of discrete data such as hyperspectral satellite imagery products.
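The abstract says Shannon entropy is used to assess the uncertainty of each rule and to filter out unreliable ones. The sketch below illustrates that general idea only; it is not the DoTRules implementation. It assumes each rule is summarised by the class counts of the training samples it matches, and the names `shannon_entropy`, `filter_rules`, and the `max_entropy` threshold are hypothetical.

```python
import math


def shannon_entropy(class_counts):
    """Entropy in bits of the class distribution among samples matching a rule."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("rule matches no samples")
    entropy = 0.0
    for count in class_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def filter_rules(rules, max_entropy):
    """Keep only rules whose class distribution is at most max_entropy bits."""
    return {
        name: counts
        for name, counts in rules.items()
        if shannon_entropy(counts) <= max_entropy
    }


if __name__ == "__main__":
    # Hypothetical rules with hypothetical class counts, for illustration only.
    rules = {
        "rule_a": {"water": 48, "soil": 2},   # nearly pure, low uncertainty
        "rule_b": {"water": 25, "soil": 25},  # evenly split, high uncertainty
    }
    trusted = filter_rules(rules, max_entropy=0.5)
    print(sorted(trusted))
```

A rule that always predicts one class has entropy 0, and an even split between two classes has entropy 1 bit, so a lower threshold keeps only the more consistent rules.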
