Detecting Recovery Problems Just in Time: Application of Automated Linguistic Analysis and Supervised Machine Learning to an Online Substance Abuse Forum (Preprint)

Mapping Intimacies ◽

10.2196/preprints.10136 ◽

2018 ◽

Author(s):

Rachel Kornfield ◽

Prathusha K Sarma ◽

Dhavan V Shah ◽

Fiona McTavish ◽

Gina Landucci ◽

...

Keyword(s):

Machine Learning ◽

Decision Trees ◽

Real Time ◽

Language Use ◽

Discussion Forum ◽

Supervised Machine Learning ◽

Bag Of Words ◽

Word Count ◽

Trained Staff ◽

Linguistic Inquiry

BACKGROUND Online discussion forums allow those in addiction recovery to seek help through text-based messages, including when facing triggers to drink or use drugs. Trained staff (or “moderators”) may participate within these forums to offer guidance and support when participants are struggling but must expend considerable effort to continually review new content. Demands on moderators limit the scalability of evidence-based digital health interventions.

OBJECTIVE Automated identification of recovery problems could allow moderators to engage in more timely and efficient ways with participants who are struggling. This paper aimed to investigate whether computational linguistics and supervised machine learning can be applied to successfully flag, in real time, those discussion forum messages that moderators find most concerning.

METHODS Training data came from a trial of a mobile phone-based health intervention for individuals in recovery from alcohol use disorder, with human coders labeling discussion forum messages according to whether or not authors mentioned problems in their recovery process. Linguistic features of these messages were extracted via several computational techniques: (1) a Bag-of-Words approach, (2) the dictionary-based Linguistic Inquiry and Word Count program, and (3) a hybrid approach combining the most important features from both Bag-of-Words and Linguistic Inquiry and Word Count. These features were applied within binary classifiers leveraging several methods of supervised machine learning: support vector machines, decision trees, and boosted decision trees. Classifiers were evaluated in data from a later deployment of the recovery support intervention.
RESULTS To distinguish recovery problem disclosures, the Bag-of-Words approach relied on domain-specific language, including words explicitly linked to substance use and mental health (“drink,” “relapse,” “depression,” and so on), whereas the Linguistic Inquiry and Word Count approach relied on language characteristics such as tone, affect, insight, and presence of quantifiers and time references, as well as pronouns. A boosted decision tree classifier utilizing features from both Bag-of-Words and Linguistic Inquiry and Word Count performed best in identifying problems disclosed within the discussion forum, achieving 88% sensitivity and 82% specificity in a separate cohort of patients in recovery.

CONCLUSIONS Differences in language use can distinguish messages disclosing recovery problems from other message types. Incorporating machine learning models based on language use allows real-time flagging of concerning content such that trained staff may engage more efficiently and focus their attention on time-sensitive issues.
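As a rough illustration of the Bag-of-Words feature extraction mentioned in the abstract above, the sketch below turns messages into word-count vectors using only the Python standard library. The function names (`tokenize`, `build_vocabulary`, `bag_of_words`) and the two example messages are hypothetical and invented for illustration; the abstract does not describe the study's actual preprocessing, vocabulary, or tooling.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(messages):
    """Map each distinct token in the corpus to a column index (sorted order)."""
    tokens = sorted({tok for msg in messages for tok in tokenize(msg)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(message, vocabulary):
    """Return a count vector for one message; unknown tokens are ignored."""
    counts = Counter(tokenize(message))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical forum messages, invented for illustration only.
messages = [
    "I had a drink last night and I fear a relapse",
    "Great meeting today, feeling grateful",
]
vocabulary = build_vocabulary(messages)
features = [bag_of_words(m, vocabulary) for m in messages]
```

Each row of `features` could then be passed to any binary classifier; the choice of classifier is independent of this counting step.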


A Comparative Study of Supervised Machine Learning Techniques for Deceptive Review Identification Using Linguistic Inquiry and Word Count

Advances in Intelligent Systems and Computing - Computational Intelligence in Information Systems ◽

10.1007/978-3-030-68133-3_10 ◽

2021 ◽

pp. 97-105

Author(s):

Dinooshi Poornima Jayathunga ◽

R. M. Iranthi Shashikala Ranasinghe ◽

Ramashini Murugiah

Keyword(s):

Machine Learning ◽

Comparative Study ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Word Count ◽

Learning Techniques ◽

Linguistic Inquiry


Performance Improvement of Decision Tree: A Robust Classifier Using Tabu Search Algorithm

Applied Sciences ◽

10.3390/app11156728 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6728

Author(s):

Muhammad Asfand Hafeez ◽

Muhammad Rashid ◽

Hassan Tariq ◽

Zain Ul Abideen ◽

Saud S. Alotaibi ◽

...

Keyword(s):

Machine Learning ◽

Tabu Search ◽

Decision Tree ◽

Decision Trees ◽

Search Algorithm ◽

Learning Algorithms ◽

Performance Comparison ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Tabu Search Algorithm

Classification and regression are the major applications of machine learning algorithms, which are widely used to solve problems in numerous domains of engineering and computer science. Different classifiers based on optimizing the decision tree have been proposed; however, the approach is still evolving. This paper presents a novel and robust classifier based on decision tree and tabu search algorithms. To improve performance, the proposed algorithm constructs multiple decision trees while employing a tabu search algorithm to consistently monitor the leaf and decision nodes in the corresponding decision trees. Additionally, the tabu search algorithm is responsible for balancing the entropy of the corresponding decision trees. To train the model, we used clinical data from COVID-19 patients to predict whether a patient is suffering. The experimental results were obtained using our proposed classifier based on the scikit-learn library in Python. An extensive performance comparison against conventional supervised machine learning algorithms is presented using Big O and statistical analysis. Moreover, a performance comparison with optimized state-of-the-art classifiers is also presented. The achieved accuracy of 98%, the required execution time of 55.6 ms, and the area under the receiver operating characteristic curve (AUROC) of 0.95 for the proposed method indicate that the proposed classifier is convenient for large datasets.
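For readers unfamiliar with tabu search, the following is a minimal generic sketch of the metaheuristic on a toy bit-string problem. It does not reproduce the paper's classifier, whose details are not given in the abstract. The names `tabu_search`, `objective`, and `tenure`, and the toy target pattern, are all assumptions made for illustration.

```python
from collections import deque


def tabu_search(objective, start, iterations=50, tenure=3):
    """Maximize `objective` over bit lists by flipping one bit per step.

    Recently flipped positions are kept in a short-term tabu list so the
    search does not immediately undo its own moves.
    """
    current = list(start)
    best, best_score = list(current), objective(current)
    tabu = deque(maxlen=tenure)  # positions flipped in the last `tenure` steps

    for _ in range(iterations):
        candidates = []
        for i in range(len(current)):
            neighbor = current[:]
            neighbor[i] = 1 - neighbor[i]
            score = objective(neighbor)
            # Aspiration: a tabu move is allowed if it beats the best so far.
            if i not in tabu or score > best_score:
                candidates.append((score, i, neighbor))
        if not candidates:
            break
        # Take the best admissible neighbor, even if it is worse than current.
        score, i, current = max(candidates, key=lambda c: c[0])
        tabu.append(i)
        if score > best_score:
            best, best_score = current[:], score
    return best, best_score


# Toy objective (hypothetical): number of positions matching a target pattern.
target = [1, 0, 1, 1, 0, 1]


def matches(bits):
    return sum(1 for a, b in zip(bits, target) if a == b)


best, best_score = tabu_search(matches, [0] * len(target))
```

Accepting the best admissible neighbor even when it is worse than the current solution is what lets tabu search leave local optima; the tabu list prevents it from cycling straight back.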


Classification of Military Aircraft in Real-time Radar Systems based on Supervised Machine Learning with Labelled ADS-B Data

2018 Sensor Data Fusion: Trends, Solutions, Applications (SDF) ◽

10.1109/sdf.2018.8547077 ◽

2018 ◽

Cited By ~ 3

Author(s):

Kaeye Dastner ◽

Susie Brunessaux ◽

Elke Schmid ◽

Bastian von Hasler zu Roseneckh-Kohler ◽

Felix Opitz

Keyword(s):

Machine Learning ◽

Real Time ◽

Supervised Machine Learning ◽

Military Aircraft ◽

Radar Systems


Cloud-Based ROP Prediction and Optimization in Real-Time Using Supervised Machine Learning

Proceedings of the 7th Unconventional Resources Technology Conference ◽

10.15530/urtec-2019-343 ◽

2019 ◽

Cited By ~ 2

Author(s):

Kriti Singh ◽

Sai Sharan Yalamarty ◽

Mohammadreza Kamyab ◽

Curtis Cheatham

Keyword(s):

Machine Learning ◽

Real Time ◽

Supervised Machine Learning


Analysis of Decision Tree Induction Algorithms

Research Society and Development ◽

10.33448/rsd-v8i11.1473 ◽

2019 ◽

Vol 8 (11) ◽

pp. e298111473

Author(s):

Hugo Kenji Rodrigues Okada ◽

Andre Ricardo Nascimento das Neves ◽

Ricardo Shitsuka

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Decision Trees ◽

Quantitative Study ◽

Data Structures ◽

Execution Time ◽

Supervised Machine Learning ◽

Decision Tree Induction ◽

Classification And Regression ◽

Cart Algorithm

Decision trees are data structures or computational methods that enable nonparametric supervised machine learning and are used in classification and regression tasks. The aim of this paper is to present a comparison between the decision tree induction algorithms C4.5 and CART. A quantitative study is performed in which the two methods are compared in terms of operation and complexity. The experiments showed practically equal hit percentages. In execution time for tree induction, however, the CART algorithm was approximately 46.24% slower than C4.5 and was considered to be more effective.
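One commonly described difference between the two algorithms is the split criterion: C4.5 is usually presented as ranking splits by gain ratio (entropy-based), while CART for classification is usually presented as using the decrease in Gini impurity. The abstract above does not spell this out, so the sketch below is only an illustration of those two standard criteria on hypothetical label lists; all function names are assumptions.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of samples belonging to each class."""
    total = len(labels)
    return [n / total for n in Counter(labels).values()]


def entropy(labels):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def gini(labels):
    """Gini impurity."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def weighted_impurity(impurity, left, right):
    """Size-weighted impurity of the two children of a binary split."""
    total = len(left) + len(right)
    return (len(left) / total) * impurity(left) + (len(right) / total) * impurity(right)


def information_gain(parent, left, right):
    return entropy(parent) - weighted_impurity(entropy, left, right)


def gain_ratio(parent, left, right):
    """C4.5-style criterion: information gain divided by split information."""
    total = len(parent)
    split_info = -sum(
        (len(part) / total) * math.log2(len(part) / total)
        for part in (left, right)
        if part
    )
    return information_gain(parent, left, right) / split_info if split_info > 0 else 0.0


def gini_decrease(parent, left, right):
    """CART-style criterion: reduction in Gini impurity."""
    return gini(parent) - weighted_impurity(gini, left, right)


# Hypothetical labels for one candidate binary split.
parent = ["a"] * 4 + ["b"] * 4
left, right = ["a", "a", "a", "b"], ["a", "b", "b", "b"]
split_gain_ratio = gain_ratio(parent, left, right)
split_gini_decrease = gini_decrease(parent, left, right)
```

Both criteria reward splits that make the children purer; they differ in how impurity is measured and in whether the score is normalized by the split's own entropy.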


2107. Decision Trees vs. Neural Networks for Supervised Machine Learning-Based Prediction of Healthcare-Associated Urinary Tract Infections

Open Forum Infectious Diseases ◽

10.1093/ofid/ofy210.1763 ◽

2018 ◽

Vol 5 (suppl_1) ◽

pp. S618-S618

Author(s):

Philip Zachariah ◽

Elioth Mirsha Sanabria Buenaventura ◽

Jianfang Liu ◽

Bevin Cohen ◽

David Yao ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Urinary Tract ◽

Decision Trees ◽

Urinary Tract Infections ◽

Supervised Machine Learning ◽

Tract Infections ◽

Healthcare Associated


Earthquake Detection in a Static and Dynamic Environment Using Supervised Machine Learning and a Novel Feature Extraction Method

Sensors ◽

10.3390/s20030800 ◽

2020 ◽

Vol 20 (3) ◽

pp. 800 ◽

Cited By ~ 2

Author(s):

Irshad Khan ◽

Seonhwa Choi ◽

Young-Woo Kwon

Keyword(s):

Machine Learning ◽

Real Time ◽

Dynamic Environment ◽

Supervised Machine Learning ◽

Experimental Result ◽

Feature Extraction Method ◽

Proposed Model ◽

Unseen Data ◽

Iot Devices ◽

Hard Real Time

Detecting earthquakes using smartphones or IoT devices in real time is an arduous and challenging task, not only because it is subject to hard real-time constraints but also because earthquake signals resemble non-earthquake signals (i.e., noise or other activities). Moreover, the variety of human activities makes detection more difficult when a smartphone is used as an earthquake-detecting sensor. To that end, in this article, we leverage a machine learning technique with earthquake features rather than traditional seismic methods. First, we split the detection task into two categories: static environment and dynamic environment. Then, we experimentally evaluate different features and propose the most appropriate machine learning model and features for the static environment to tackle the issue of noisy components and detect earthquakes in real time with a lower false alarm rate. The proposed model shows promising results not only on the given dataset but also on unseen data, pointing to the generalization characteristics of the model. Finally, we demonstrate that the proposed model can also be used in the dynamic environment if it is trained with a different dataset.


Real-time prediction of inpatient length of stay for discharge prioritization

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocv106 ◽

2015 ◽

Vol 23 (e1) ◽

pp. e2-e10 ◽

Cited By ~ 29

Author(s):

Sean Barnes ◽

Eric Hamrock ◽

Matthew Toerper ◽

Sauleh Siddiqui ◽

Scott Levin

Keyword(s):

Machine Learning ◽

Real Time ◽

Health Information ◽

Patient Flow ◽

Supervised Machine Learning ◽

Youden Index ◽

Learning Methods ◽

Machine Learning Methods ◽

High Resource ◽

Sensitivity Specificity

Objective: Hospitals are challenged to provide timely patient care while maintaining high resource utilization. This has prompted hospital initiatives to increase patient flow and minimize nonvalue added care time. Real-time demand capacity management (RTDC) is one such initiative whereby clinicians convene each morning to predict patients able to leave the same day and prioritize their remaining tasks for early discharge. Our objective is to automate and improve these discharge predictions by applying supervised machine learning methods to readily available health information.

Materials and Methods: The authors use supervised machine learning methods to predict patients’ likelihood of discharge by 2 p.m. and by midnight each day for an inpatient medical unit. Using data collected over 8000 patient stays and 20 000 patient days, the predictive performance of the model is compared to clinicians using sensitivity, specificity, Youden’s Index (i.e., sensitivity + specificity – 1), and aggregate accuracy measures.

Results: The model compared to clinician predictions demonstrated significantly higher sensitivity (P < .01), lower specificity (P < .01), and a comparable Youden Index (P > .10). Early discharges were less predictable than midnight discharges. The model was more accurate than clinicians in predicting the total number of daily discharges and capable of ranking patients closest to future discharge.

Conclusions: There is potential to use readily available health information to predict daily patient discharges with accuracies comparable to clinician predictions. This approach may be used to automate and support daily RTDC predictions aimed at improving patient flow.
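The evaluation measures named in this abstract can be computed directly from binary predictions. The sketch below uses the definition of Youden's Index given above (sensitivity + specificity - 1); the function names and the small label lists are hypothetical and are not data from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Share of actual positives that were predicted positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual negatives that were predicted negative."""
    return tn / (tn + fp)


def youden_index(sens, spec):
    """Youden's Index: sensitivity + specificity - 1."""
    return sens + spec - 1


# Hypothetical labels: 1 = discharged that day, 0 = not discharged.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
j = youden_index(sens, spec)
```

Because the index sums the two rates, a model with higher sensitivity and lower specificity than a comparator can still end up with a comparable Youden Index, which is the pattern the Results describe.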


A tale of four cities: A semantic analysis comparing the newspaper coverage of air pollution in Hong Kong, London, Pittsburgh, and Tianjin from 2014 to 2017

Newspaper Research Journal ◽

10.1177/0739532919873438 ◽

2019 ◽

Vol 41 (1) ◽

pp. 37-52

Author(s):

Tongxin Sun ◽

Bu Zhong

Keyword(s):

Air Pollution ◽

Hong Kong ◽

Air Quality ◽

Real Time ◽

Semantic Analysis ◽

Newspaper Coverage ◽

Word Count ◽

The Public ◽

Computer Aided ◽

Linguistic Inquiry

A computer-aided semantic analysis (using Linguistic Inquiry and Word Count [LIWC]) examined how newspaper coverage of air pollution from 2014 to 2017 may affect the public agenda in four cities—Hong Kong, London, Pittsburgh, and Tianjin. Results show that, after controlling for real-time air quality, an agenda-setting effect was found in Hong Kong, London, and Pittsburgh, but not in Tianjin. Tianjin’s reports also contained more future-framed words but fewer present-framed words than those of the other cities.


Recognizing Eruptions of Mount Etna through Machine Learning Using Multiperspective Infrared Images

Remote Sensing ◽

10.3390/rs12060970 ◽

2020 ◽

Vol 12 (6) ◽

pp. 970 ◽

Cited By ~ 3

Author(s):

Claudia Corradino ◽

Gaetana Ganci ◽

Annalisa Cappello ◽

Giuseppe Bilotta ◽

Sonia Calvari ◽

...

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Real Time ◽

Volcanic Eruptions ◽

Mount Etna ◽

Depth Information ◽

Bag Of Words ◽

Infrared Images ◽

Eruptive Activity ◽

Thermal Cameras

Detecting, locating and characterizing volcanic eruptions at an early stage provides the best means to plan and mitigate against potential hazards. Here, we present an automatic system which is able to recognize and classify the main types of eruptive activity occurring at Mount Etna by exploiting infrared images acquired using thermal cameras installed around the volcano. The system employs a machine learning approach based on a Decision Tree tool and a Bag of Words-based classifier. The Decision Tree provides information on the visibility level of the monitored area, while the Bag of Words-based classifier detects the onset of eruptive activity and recognizes the eruption type as either explosion and/or lava flow or plume degassing/ash. Applied in real-time to each image of each of the thermal cameras placed around Etna, the proposed system provides two outputs, namely, visibility level and recognized eruptive activity status. By merging these outcomes, the monitored phenomena can be fully described from different perspectives to acquire more in-depth information in real time and in an automatic way.
