Machine Learning for Predicting Adherence to Internet-Delivered Psychotherapy for Symptoms of Depression and Anxiety after Myocardial Infarction: Insights from the U-CARE Heart Trial (Preprint)

BACKGROUND: Low adherence to recommended treatments is a multifactorial problem in rehabilitation for patients with myocardial infarction (MI). In a nationwide trial of internet-delivered cognitive behavior therapy (iCBT) for the high-risk subgroup of patients with MI also reporting symptoms of anxiety, depression, or both (MI-ANXDEP), adherence was low. Since low adherence to psychotherapy leads to a waste of therapeutic resources and risky treatment discontinuation in MI-ANXDEP patients, identifying early predictors of adherence is potentially valuable for effective targeted care. OBJECTIVE: To apply predictive modelling with supervised machine learning to investigate both established and novel predictors of iCBT adherence in MI-ANXDEP patients. METHODS: Data were from 90 MI-ANXDEP patients recruited from 25 hospitals in Sweden and randomized to treatment in the iCBT trial U-CARE Heart. The time point of prediction was completion of the first homework assignment. Adherence was defined as having completed at least the first two homework assignments within the 14-week treatment period. A supervised machine learning procedure was applied to identify the most potent predictors of adherence available at the first treatment session from a range of demographic, clinical, psychometric, and linguistic predictors. The internal binary classifier was a random forest model within a 3x10-fold cross-validated recursive feature elimination (RFE) resampling, which selected the final predictor subset that best differentiated adherers from non-adherers. RESULTS: Patient mean age was 58.4 (9.4) years, 62% (56/90) were men, and 48% (43/90) were adherent. Out of the 34 potential predictors of adherence, RFE selected an optimal subset of 56% (19/34) (Accuracy 0.64, 95% CI 0.61-0.68, P < 0.01). The strongest predictors of adherence were, in order of importance, (1) self-assessed cardiac-related fear, (2) sex, and (3) the number of words the patient used to answer the first homework assignment. CONCLUSIONS: Adherence to iCBT for MI-ANXDEP patients was best predicted by cardiac-related fear and sex, consistent with previous research, but also by novel linguistic predictors from written patient behavior, which conceivably indicate verbal ability or therapeutic alliance. Future research should investigate potential causal mechanisms, seek to determine what underlying constructs the linguistic predictors tap into, and examine whether these findings replicate for other interventions, outside of Sweden, in larger samples, and for patients with other conditions who are offered iCBT. TRIAL REGISTRATION: ClinicalTrials.gov NCT01504191; https://clinicaltrials.gov/ct2/show/NCT01504191 (Archived by WebCite at http://www.webcitation.org/6xWWSEQ22)
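
The 3x10-fold resampling named in METHODS can be illustrated with a small standard-library Python sketch. It only generates the train/test index splits for 90 patients. It is not the authors' pipeline: the function name `repeated_kfold` and the seed are hypothetical, the folds are not stratified, and the random forest and RFE steps are left out.

```python
import random

def repeated_kfold(n_samples, n_folds=10, n_repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a new random partition for every repeat
        # Deal the shuffled indices into n_folds nearly equal folds
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for k in range(n_folds):
            test_idx = folds[k]
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train_idx, test_idx

# 90 patients, 10 folds, 3 repeats
splits = list(repeated_kfold(n_samples=90))
```

With 90 patients, each of the 30 resamples holds out 9 patients and trains on the remaining 81, so every patient is held out once per repeat.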


Predictive Modelling of Employee Turnover in Indian IT Industry Using Machine Learning Techniques

Vision: The Journal of Business Perspective ◽

10.1177/0972262918821221 ◽

2019 ◽

Vol 23 (1) ◽

pp. 12-21 ◽

Cited By ~ 2

Author(s):

Shikha N. Khera ◽

Divya

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Confusion Matrix ◽

Predictive Modelling ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

IT Industry ◽

Knowledge Based ◽

Employee Attrition

The information technology (IT) industry in India has faced a systemic issue of high attrition in the past few years, resulting in monetary and knowledge-based losses to companies. The aim of this research is to develop a model that predicts employee attrition and gives organizations the opportunity to address issues and improve retention. A predictive model was developed based on a supervised machine learning algorithm, the support vector machine (SVM). Archival employee data (consisting of 22 input features) were collected from the Human Resource databases of three IT companies in India, including employment status (the response variable) at the time of collection. The confusion matrix for the SVM model showed an accuracy of 85 per cent. The results also show that the model performs better in predicting who will leave the firm than in predicting who will not.
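
How overall accuracy and the per-class performance follow from a confusion matrix can be sketched in a few lines of Python. The four counts below are invented for illustration and are not the study's data. They were chosen only so that accuracy equals 85 per cent and recall is higher for leavers than for stayers, and the function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Accuracy and per-class recall, treating 'left the firm' as the positive class."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,      # share of all employees classified correctly
        "recall_leavers": tp / (tp + fn),   # share of actual leavers predicted to leave
        "recall_stayers": tn / (tn + fp),   # share of actual stayers predicted to stay
    }

# Made-up counts: 50 leavers and 50 stayers
metrics = confusion_metrics(tp=45, fn=5, fp=10, tn=40)
```

Comparing the two recall values is one way to express the statement that a model predicts leavers better than stayers.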


Nonprofit Role Classification Using Mission Descriptions and Supervised Machine Learning

Nonprofit and Voluntary Sector Quarterly ◽

10.1177/08997640211057393 ◽

2021 ◽

pp. 089976402110573

Author(s):

Megan LePere-Schloop

Keyword(s):

Machine Learning ◽

Geographic Variation ◽

Mission Statements ◽

Research Note ◽

Supervised Machine Learning ◽

Future Research ◽

Large Set ◽

Large Sample ◽

Qualitative Approaches

Scholars have used both quantitative and qualitative approaches to empirically study nonprofit roles. Mission statements and program descriptions often reflect such roles; however, until recently, collecting and classifying a large sample has been labor-intensive. This research note uses data on United Ways that e-filed their 990 forms, together with supervised machine learning, to illustrate an approach for classifying a large set of mission descriptions by roles. Temporal and geographic variation in the roles detected in mission statements suggests that such an approach may be fruitful in future research.


Support Vector Machine

Handbook of Research on Modern Systems Analysis and Design Technologies and Applications ◽

10.4018/978-1-59904-887-1.ch028 ◽

2009 ◽

pp. 501-522 ◽

Cited By ~ 1

Author(s):

A. B.M. Shawkat Ali

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

System Analysis ◽

Learning Algorithm ◽

Mathematical Formulation ◽

Supervised Machine Learning ◽

Future Research ◽

Support Vector ◽

Analysis And Design ◽

Simple Demonstration

From the beginning, machine learning methodology, which is the origin of artificial intelligence, has been spreading rapidly through different research communities with successful outcomes. This chapter aims to introduce system analysts and designers to a comparatively new statistical supervised machine learning algorithm called the support vector machine (SVM). We explain two useful areas of SVM, that is, classification and regression, with a basic mathematical formulation and a simple demonstration to make SVM easier to understand. Prospects and challenges of future research in this emerging area are also described. Future research on SVM will provide improved, higher-quality access for users. Therefore, developing an automated SVM system with state-of-the-art technologies is of paramount importance, and hence this chapter provides an important link between the system analysis and design perspective and this evolving research arena.


Detection of Leek Rust Disease under Field Conditions Using Hyperspectral Proximal Sensing and Machine Learning

Remote Sensing ◽

10.3390/rs13071341 ◽

2021 ◽

Vol 13 (7) ◽

pp. 1341

Author(s):

Simon Appeltans ◽

Jan G. Pieters ◽

Abdul M. Mouazen

Keyword(s):

Machine Learning ◽

Field Conditions ◽

Environmental Cost ◽

Supervised Machine Learning ◽

Future Research ◽

Complex Data ◽

Rust Disease ◽

Proximal Sensing ◽

Full Field ◽

Extensive Evaluation

Rust disease is an important problem for leek cultivation worldwide. It reduces market value and in extreme cases destroys the entire harvest. Farmers have to resort to periodic full-field fungicide applications to prevent the spread of disease, once every 1 to 5 weeks, depending on the cultivar and weather conditions. This implies an economic cost for the farmer and an environmental cost for society. Hyperspectral sensors have been extensively used to address this issue in research, but their application in the field has been limited to a relatively low number of crops, excluding leek, due to the high investment costs and complex data gathering and analysis associated with these sensors. To fill this gap, a methodology was developed for detecting leek rust disease using hyperspectral proximal sensing data combined with supervised machine learning. First, a hyperspectral library was constructed containing 43,416 spectra with a waveband range of 400–1000 nm, measured under field conditions. Then, an extensive evaluation of 11 common classifiers was performed using the scikit-learn machine learning library in Python, combined with a variety of wavelength selection techniques and preprocessing strategies. The best performing model was a (linear) logistic regression model that was able to correctly classify rust disease with an accuracy of 98.14%, using reflectance values at 556 and 661 nm, combined with the value of the first derivative at 511 nm. This model was used to classify unlabelled hyperspectral images, confirming that the model was able to accurately classify leek rust disease symptoms. It can be concluded that the results in this work are an important step towards the mapping of leek rust disease, and that future research is needed to overcome certain challenges before variable rate fungicide applications can be adopted against leek rust disease.
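
The three inputs of the best model (reflectance at 556 and 661 nm, first derivative at 511 nm) can be illustrated with a standard-library Python sketch. Everything numeric here is an assumption. The spectrum is synthetic, the derivative is taken as a simple central finite difference (the abstract does not say how it was computed), and the weights and bias are placeholders rather than the fitted coefficients. The function names are hypothetical.

```python
import math

def first_derivative(spectrum, wavelength, step=1):
    """Central finite difference of reflectance with respect to wavelength (per nm)."""
    return (spectrum[wavelength + step] - spectrum[wavelength - step]) / (2 * step)

def rust_features(spectrum):
    """Feature vector: reflectance at 556 nm and 661 nm, first derivative at 511 nm."""
    return [spectrum[556], spectrum[661], first_derivative(spectrum, 511)]

def logistic_probability(features, weights, bias):
    """Logistic regression output for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic spectrum at 1 nm resolution: reflectance rising linearly from 0.05 at 400 nm
spectrum = {wl: 0.05 + 0.0005 * (wl - 400) for wl in range(400, 1001)}
features = rust_features(spectrum)

# Placeholder coefficients, NOT the values fitted in the study
probability = logistic_probability(features, weights=[1.0, -1.0, 10.0], bias=0.0)
```

A model this small has only four parameters, which is one reason a three-feature logistic regression is straightforward to apply pixel by pixel to a hyperspectral image.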


Supervised Machine Learning for Predicting SMME Sales: An Evaluation of Three Algorithms

The African Journal of Information and Communication ◽

10.23962/10539/31371 ◽

2021 ◽

pp. 1-21

Author(s):

Helper Zhou ◽

Victor Gumbo

Keyword(s):

Machine Learning ◽

Predictive Analytics ◽

Predictive Modelling ◽

Sales Performance ◽

Ordinary Least Squares ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Selection Operator

The emergence of machine learning algorithms presents the opportunity for a variety of stakeholders to perform advanced predictive analytics and to make informed decisions. However, to date there have been few studies in developing countries that evaluate the performance of such algorithms—with the result that pertinent stakeholders lack an informed basis for selecting appropriate techniques for modelling tasks. This study aims to address this gap by evaluating the performance of three machine learning techniques: ordinary least squares (OLS), least absolute shrinkage and selection operator (LASSO), and artificial neural networks (ANNs). These techniques are evaluated in respect of their ability to perform predictive modelling of the sales performance of small, medium and micro enterprises (SMMEs) engaged in manufacturing. The evaluation finds that the ANNs algorithm’s performance is far superior to that of the other two techniques, OLS and LASSO, in predicting the SMMEs’ sales performance.
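
The difference between OLS and LASSO can be made concrete for the simplest case of a single centred predictor with no intercept, where LASSO reduces to soft-thresholding the OLS quantity. This Python sketch uses made-up numbers, not the SMME sales data, and leaves the ANN out. It assumes the LASSO objective is written as (1/2n) * sum((y - b*x)^2) + penalty * |b|, and the function names are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value towards zero by penalty; return 0.0 if it lies within the penalty band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def fit_slope(x, y, penalty=0.0):
    """Slope of y on x without intercept; penalty=0 gives OLS, penalty>0 gives LASSO."""
    n = len(x)
    covariance = sum(xi * yi for xi, yi in zip(x, y)) / n
    variance = sum(xi * xi for xi in x) / n
    return soft_threshold(covariance, penalty) / variance

# Made-up, centred toy data
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.0, 2.1, 3.9]

ols_slope = fit_slope(x, y)                   # no shrinkage
lasso_slope = fit_slope(x, y, penalty=1.0)    # shrunk towards zero
zeroed_slope = fit_slope(x, y, penalty=5.0)   # penalty large enough to drop the predictor
```

The last call shows the "selection" part of the name: a large enough penalty sets the coefficient exactly to zero.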


Evolution of corporate reputation during an evolving controversy

Journal of Communication Management ◽

10.1108/jcom-08-2018-0072 ◽

2019 ◽

Vol 23 (1) ◽

pp. 52-71 ◽

Cited By ~ 3

Author(s):

Siyoung Chung ◽

Mark Chong ◽

Jie Sheng Chua ◽

Jin Cheon Na

Keyword(s):

Machine Learning ◽

Social Media ◽

Sentiment Analysis ◽

Corporate Reputation ◽

Supervised Machine Learning ◽

Future Research ◽

Content Type ◽

Twitter Users ◽

Corporate Crisis ◽

The Impact

Purpose: The purpose of this paper is to investigate the evolution of online sentiments toward a company (i.e. Chipotle) during a crisis, and the effects of corporate apology on those sentiments. Design/methodology/approach: This study used a very large data set of tweets (i.e. over 2.6m) about Company A’s food poisoning case (2015–2016). This case was selected because it is widely known, drew attention from various stakeholders and had many dynamics (e.g. multiple outbreaks, and across different locations). This study employed a supervised machine learning approach. Its sentiment polarity classification and relevance classification consisted of five steps: sampling, labeling, tokenization, augmentation of semantic representation, and the training of supervised classifiers for relevance and sentiment prediction. Findings: The findings show that: the overall sentiment of tweets specific to the crisis was neutral; promotions and marketing communication may not be effective in converting negative sentiments to positive sentiments; a corporate crisis drew public attention and sparked public discussion on social media; while corporate apologies had a positive effect on sentiments, the effect did not last long, as the apologies did not remove public concerns about food safety; and some Twitter users exerted a significant influence on online sentiments through their popular tweets, which were heavily retweeted among Twitter users. Research limitations/implications: Even with multiple training sessions and the use of a voting procedure (i.e. when there was a discrepancy in the coding of a tweet), there were some tweets that could not be accurately coded for sentiment. Aspect-based sentiment analysis and deep learning algorithms can be used to address this limitation in future research. This analysis of the impact of Chipotle’s apologies on sentiment did not test for a direct relationship. Future research could use manual coding to include only specific responses to the corporate apology. There was a delay between the time social media users received the news and the time they responded to it. Time delay poses a challenge to the sentiment analysis of Twitter data, as it is difficult to interpret which peak corresponds with which incident/s. This study focused solely on Twitter, which is just one of several social media sites that had content about the crisis. Practical implications: First, companies should use social media as official corporate news channels, update them frequently with any developments about the crisis, and use them proactively. Second, companies in crisis should refrain from marketing efforts. Instead, they should focus on resolving the issue at hand and not attempt to regain a favorable relationship with stakeholders right away. Third, companies can leverage video, images and humor, as well as individuals with large online social networks, to increase the reach and diffusion of their messages. Originality/value: This study is among the first to empirically investigate the dynamics of corporate reputation as it evolves during a crisis as well as the effects of corporate apology on online sentiments. It is also one of the few studies that employ sentiment analysis using a supervised machine learning method in the area of corporate reputation and communication management. In addition, it offers valuable insights to both researchers and practitioners who wish to utilize big data to understand the online perceptions and behaviors of stakeholders during a corporate crisis.


A machine learning model for the prediction of survival and tumor subtype in pancreatic ductal adenocarcinoma from preoperative diffusion-weighted imaging

European Radiology Experimental ◽

10.1186/s41747-019-0119-0 ◽

2019 ◽

Vol 3 (1) ◽

Cited By ~ 13

Author(s):

Georgios Kaissis ◽

Sebastian Ziegelmayer ◽

Fabian Lohöfer ◽

Hana Algül ◽

Matthias Eiber ◽

...

Keyword(s):

Machine Learning ◽

Pancreatic Ductal Adenocarcinoma ◽

Diffusion Weighted Imaging ◽

Area Under The Curve ◽

Supervised Machine Learning ◽

Ductal Adenocarcinoma ◽

Recursive Feature Elimination ◽

Training Cohort ◽

Diffusion Weighted ◽

Histopathological Subtype

Abstract. Background: To develop a supervised machine learning (ML) algorithm predicting above- versus below-median overall survival (OS) from diffusion-weighted imaging-derived radiomic features in patients with pancreatic ductal adenocarcinoma (PDAC). Methods: One hundred two patients with histopathologically proven PDAC were retrospectively assessed as the training cohort, and 30 prospectively accrued and retrospectively enrolled patients served as the independent validation cohort (IVC). Tumors were segmented on preoperative apparent diffusion coefficient (ADC) maps, and radiomic features were extracted. A random forest ML algorithm was fit to the training cohort and tested in the IVC. Histopathological subtype of tumor samples was assessed by immunohistochemistry in 21 IVC patients. Individual radiomic feature importance was evaluated by assessment of tree node Gini impurity decrease and recursive feature elimination. Fisher’s exact test, 95% confidence intervals (CI), and receiver operating characteristic area under the curve (ROC-AUC) were used. Results: The ML algorithm achieved 87% sensitivity (95% CI 67.3–92.7), 80% specificity (95% CI 74.0–86.7), and ROC-AUC 90% for the prediction of above- versus below-median OS in the IVC. Heterogeneity-related features were highly ranked by the model. Of the 21 patients with determined histopathological subtype, 8/9 patients predicted to experience below-median OS exhibited the quasi-mesenchymal subtype, whilst 11/12 patients predicted to experience above-median OS exhibited a non-quasi-mesenchymal subtype (p < 0.001). Conclusion: ML application to ADC radiomics allowed OS prediction with a high diagnostic accuracy in an IVC. The high overlap of clinically relevant histopathological subtypes with model predictions underlines the potential of quantitative imaging in PDAC pre-operative subtyping and prognosis.
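
The subtype counts in Results form a 2x2 table (predicted below-median OS: 8 quasi-mesenchymal, 1 not; predicted above-median OS: 1 quasi-mesenchymal, 11 not), so a two-sided Fisher's exact test can be worked through by hand. The Python sketch below assumes that the reported P value refers to this table. The function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule used (sum all tables no more probable than the observed one) is one common convention.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with margins fixed
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# Rows: predicted below-/above-median OS; columns: quasi-mesenchymal / other subtype
p_value = fisher_exact_two_sided(8, 1, 1, 11)
```

Under these assumptions the P value comes out at about 0.0004, in line with the reported p < 0.001.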


Predicting remission after internet-delivered psychotherapy in patients with depression using machine learning and multi-modal data

10.1101/2021.04.30.21256367 ◽

2021 ◽

Author(s):

John Wallert ◽

Julia Boberg ◽

Viktor Kaldo ◽

David Mataix-Cols ◽

Oskar Flygare ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Random Forest Model ◽

Supervised Machine Learning ◽

Risk Scores ◽

Recursive Feature Elimination ◽

Validation Data ◽

Forest Model ◽

Remission Status ◽

Cognitive Behaviour

BACKGROUND: Whether a patient benefits from psychotherapy or not is arguably a complex process, and heterogeneous information extracted from process, genetic, demographic, and clinical data could contribute to the prediction of remission status after psychotherapy. This study applied supervised machine learning with such multi-modal baseline data to predict remission in patients with major depressive disorder (MDD) after completed psychotherapy. METHODS: Eight hundred ninety-four genotyped adult patients (65.5% women, age range 18-75 years) diagnosed with MDD and treated with guided Internet-based Cognitive Behaviour Therapy (ICBT) at the Internet Psychiatry Clinic in Stockholm were included (2008-2016). Predictor variables from multiple domains were available: demographic, clinical, process (e.g. time to complete online questionnaires), and genetic (polygenic risk scores for depression, education and more). The outcome was remission status post ICBT (cut-off ≤10 on MADRS-S). Data were split into train (60%) and validation (40%) sets based on treatment start date. Predictor selection employed human domain knowledge followed by Recursive Feature Elimination. Model derivation was internally validated through repeated cross-validation resampling. The final random forest model was externally validated against a (i) null, (ii) logit, (iii) XGBoost, and (iv) blended meta-ensemble model on the hold-out validation set. Model transparency was explored through partial dependence and Local Interpretable Model-agnostic Explanations (LIME) analysis. RESULTS: Feature selection retained 45 predictors representing all four predictor types. With unseen validation data, the final random forest model proved reasonably accurate at classifying post ICBT remission (Accuracy 0.656 [0.604, 0.705], P vs null model = 0.004; AUC 0.687 [0.631, 0.743]), slightly better vs logit (bootstrap D = 1.730, P = 0.084) but not vs XGBoost (D = 0.463, P = 0.643). Transparency analysis showed model usage of all predictor types at both the group and individual patient level. CONCLUSION: A new, multi-modal classifier for predicting MDD remission status after ICBT treatment in routine psychiatric care was derived and empirically validated. The multi-modal approach to predicting remission may inform tailored treatment, and deserves further investigation.
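
A split "based on treatment start date" can be sketched as a chronological cut, with the earliest 60% of patients used for training and the latest 40% for validation. The Python below is an assumption about what such a split could look like, not the study's code. The patient records, the field names `patient_id` and `start_date`, and the function name `temporal_split` are all hypothetical.

```python
from datetime import date

def temporal_split(records, train_fraction=0.6):
    """Sort records by treatment start date and cut them into earlier and later parts."""
    ordered = sorted(records, key=lambda record: record["start_date"])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Ten made-up patients with start dates between 2008 and 2016
patients = [
    {"patient_id": i, "start_date": date(2008 + i % 9, 1 + i % 12, 1)}
    for i in range(10)
]
train, validation = temporal_split(patients)
```

Unlike a random split, a chronological cut keeps all validation patients later in time than all training patients, which mimics applying the model to future patients.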


A supervised machine-learning approach towards geochemical predictive modelling in archaeology

Journal of Archaeological Science ◽

10.1016/j.jas.2015.04.002 ◽

2015 ◽

Vol 59 ◽

pp. 80-88 ◽

Cited By ~ 16

Author(s):

Stijn Oonk ◽

Job Spijker

Keyword(s):

Machine Learning ◽

Predictive Modelling ◽

Supervised Machine Learning ◽

Learning Approach ◽

Machine Learning Approach


DASentimental: Detecting Depression, Anxiety, and Stress in Texts via Emotional Recall, Cognitive Networks, and Machine Learning

Big Data and Cognitive Computing ◽

10.3390/bdcc5040077 ◽

2021 ◽

Vol 5 (4) ◽

pp. 77

Author(s):

Asra Fatima ◽

Ying Li ◽

Thomas Trenholm Hills ◽

Massimo Stella

Keyword(s):

Machine Learning ◽

Semantic Memory ◽

Semantic Network ◽

Cognitive Networks ◽

Supervised Machine Learning ◽

Future Research ◽

Circumplex Model ◽

Depression Anxiety Stress Scale ◽

Written Text ◽

Semantic Distances

Most current affect scales and sentiment analysis on written text focus on quantifying valence/sentiment, the primary dimension of emotion. Distinguishing broader, more complex negative emotions of similar valence is key to evaluating mental health. We propose a semi-supervised machine learning model, DASentimental, to extract depression, anxiety, and stress from written text. We trained DASentimental to identify how N = 200 sequences of recalled emotional words correlate with recallers’ depression, anxiety, and stress from the Depression Anxiety Stress Scale (DASS-21). Using cognitive network science, we modeled every recall list as a bag-of-words (BOW) vector and as a walk over a network representation of semantic memory (in this case, free associations). This weights BOW entries according to their centrality (degree) in semantic memory and informs recalls using semantic network distances, thus embedding recalls in a cognitive representation. This embedding translated into state-of-the-art, cross-validated predictions for depression (R = 0.7), anxiety (R = 0.44), and stress (R = 0.52), equivalent to previous results employing additional human data. Powered by a multilayer perceptron neural network, DASentimental opens the door to probing the semantic organizations of emotional distress. We found that semantic distances between recalls (i.e., walk coverage) were key for estimating depression levels but redundant for anxiety and stress levels. Semantic distances from “fear” boosted anxiety predictions but were redundant when the “sad–happy” dyad was considered. We applied DASentimental to a clinical dataset of 142 suicide notes and found that the predicted depression and anxiety levels (high/low) corresponded to differences in valence and arousal as expected from a circumplex model of affect. We discuss key directions for future research enabled by artificial intelligence detecting stress, anxiety, and depression in texts.
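
The idea of weighting bag-of-words entries by degree centrality in a semantic network can be shown on a toy example. The Python sketch below uses a made-up five-edge association network and a made-up recall list. It is not the DASentimental implementation, it omits the network-distance features and the neural network, and the function names are hypothetical.

```python
def degree_centrality(edges):
    """Count how many edges touch each word in an undirected association network."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree

def weighted_bow(recall, vocabulary, degree):
    """Bag-of-words vector whose non-zero entries are the recalled words' degrees."""
    recalled = set(recall)
    return [degree.get(word, 0) if word in recalled else 0 for word in vocabulary]

# Toy free-association network (invented edges)
edges = [("sad", "cry"), ("sad", "happy"), ("happy", "smile"),
         ("fear", "dark"), ("fear", "sad")]
vocabulary = sorted({word for edge in edges for word in edge})
degree = degree_centrality(edges)

# One invented recall list, embedded over the vocabulary
# ['cry', 'dark', 'fear', 'happy', 'sad', 'smile']
vector = weighted_bow(["sad", "fear", "smile"], vocabulary, degree)
```

In this toy network "sad" has three associations, so it receives the largest weight in the vector, while words that were not recalled stay at zero.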
