The effects of co-morbidity in defining major depression subtypes associated with long-term course and severity

2014
Vol 44 (15)
pp. 3289-3302
Author(s):  
K. J. Wardenaar ◽  
H. M. van Loo ◽  
T. Cai ◽  
M. Fava ◽  
M. J. Gruber ◽  
...  

Background: Although variation in the long-term course of major depressive disorder (MDD) is not strongly predicted by existing symptom subtype distinctions, recent research suggests that prediction can be improved by using machine learning methods. However, it is not known whether these distinctions can be refined by adding information about co-morbid conditions. The current report presents results on this question. Method: Data came from 8261 respondents with lifetime DSM-IV MDD in the World Health Organization (WHO) World Mental Health (WMH) Surveys. Outcomes included four retrospectively reported measures of persistence/severity of course (years in episode; years in chronic episodes; hospitalization for MDD; disability due to MDD). Machine learning methods (regression tree analysis; lasso, ridge and elastic net penalized regression) followed by k-means cluster analysis were used to augment previously detected subtypes with information about prior co-morbidity to predict these outcomes. Results: Predicted values were strongly correlated across outcomes. Cluster analysis of the predicted values found three clusters with consistently high, intermediate or low values. The high-risk cluster (32.4% of cases) accounted for 56.6–72.9% of high persistence, high chronicity, hospitalization and disability. This high-risk cluster had both higher sensitivity and a higher likelihood ratio positive (LR+; the relative proportion of cases in the high-risk cluster versus the other clusters having the adverse outcomes) than in a parallel analysis that excluded measures of co-morbidity as predictors. Conclusions: Although these results, based on retrospective data, suggest that useful MDD subtyping distinctions can be made with machine learning and clustering across multiple indicators of illness persistence/severity, replication with prospective data is needed to confirm this preliminary conclusion.
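
As a rough illustration of the analytic workflow described above (penalized regression to predict a persistence/severity outcome, followed by k-means clustering of the predicted values into risk groups), the following Python sketch uses scikit-learn on synthetic binary indicators. The sample size, variables and model settings are placeholders, not the WMH survey data or the authors' exact pipeline.

```python
# Illustrative sketch (not the authors' WMH pipeline): elastic-net regression
# to predict a severity outcome from binary symptom/co-morbidity indicators,
# then k-means clustering of the predicted values into three risk groups.
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n, p = 1000, 20                                   # hypothetical sample and predictor count
X = rng.binomial(1, 0.3, size=(n, p))             # synthetic binary indicators
y = X @ rng.normal(0, 1, p) + rng.normal(0, 1, n) # synthetic severity outcome

# Penalized regression with cross-validated regularization strength
model = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5).fit(X, y)
y_hat = model.predict(X)

# Cluster the predicted values into high/intermediate/low groups
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(y_hat.reshape(-1, 1))
for k in range(3):
    print(f"cluster {k}: n={np.sum(clusters == k)}, "
          f"mean predicted severity={y_hat[clusters == k].mean():.2f}")
```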

Author(s):  
Ben Tribelhorn ◽  
H. E. Dillon

Abstract This paper is a preliminary report on work done to explore the use of unsupervised machine learning methods to predict the onset of turbulent transitions in natural convection systems. The Lorenz system was chosen to test the machine learning methods because of the relative simplicity of this dynamic system. We developed a robust numerical solution to the Lorenz equations using a fourth-order Runge-Kutta method with a time step of 0.001 seconds. We solved the Lorenz equations for a large range of Rayleigh ratios from 1 to 1000 while keeping the geometry and Prandtl number constant. We calculated the spectral density and various descriptive statistics, and performed a cluster analysis using unsupervised machine learning. We examined the performance of the machine learning system for different ranges of the Rayleigh ratio. We found that the automated cluster analysis aligns well with known key transition regions of the convection system. We determined that considering smaller ranges of Rayleigh ratios may improve the performance of the machine learning tools. We also identified possible additional behaviors not visible in z-axis bifurcation plots. This unsupervised learning approach can be leveraged on other systems where numerical analysis is computationally intractable or more difficult. The results provide a foundation for expanding the study to variations in Prandtl number and geometry. Future work will focus on applying the methods to more complex natural convection systems, including the development of new methods for Nusselt correlations.
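
The numerical setup lends itself to a compact illustration. The sketch below integrates the Lorenz equations with a fourth-order Runge-Kutta scheme at a 0.001 time step, sweeps the Rayleigh ratio while holding sigma (Prandtl) and beta (geometry) fixed, and clusters a few simple trajectory statistics with k-means. The feature set and parameter values are illustrative assumptions, not the paper's exact descriptors.

```python
# Minimal sketch: RK4 integration of the Lorenz equations (dt = 0.001),
# sweeping the Rayleigh ratio r with sigma and beta fixed, then clustering
# simple z-trajectory statistics with k-means.
import numpy as np
from sklearn.cluster import KMeans

def lorenz(state, sigma=10.0, beta=8.0 / 3.0, r=28.0):
    x, y, z = state
    return np.array([sigma * (y - x), x * (r - z) - y, x * y - beta * z])

def rk4(f, state, dt, steps, **params):
    traj = np.empty((steps, 3))
    for i in range(steps):
        k1 = f(state, **params)
        k2 = f(state + 0.5 * dt * k1, **params)
        k3 = f(state + 0.5 * dt * k2, **params)
        k4 = f(state + dt * k3, **params)
        state = state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[i] = state
    return traj

dt, steps = 0.001, 20000
rs = np.linspace(1, 1000, 50)                  # sweep of Rayleigh ratios
features = []
for r in rs:
    traj = rk4(lorenz, np.array([1.0, 1.0, 1.0]), dt, steps, r=r)
    z = traj[steps // 2:, 2]                   # drop the transient, keep z
    features.append([z.mean(), z.std(), z.max() - z.min()])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(np.array(features))
for r, lab in zip(rs, labels):
    print(f"Rayleigh ratio {r:7.1f} -> cluster {lab}")
```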


Author(s):  
Hossein Sangrody ◽  
Ning Zhou ◽  
Salih Tutun ◽  
Benyamin Khorramdel ◽  
Mahdi Motalleb ◽  
...  

2020
Vol 242
pp. 05003
Author(s):  
A.E. Lovell ◽  
A.T. Mohan ◽  
P. Talou ◽  
M. Chertkov

As machine learning methods gain traction in the nuclear physics community, especially methods that aim to propagate uncertainties to unmeasured quantities, it is important to understand how uncertainty in the training data, coming either from theory or from experiment, propagates to uncertainty in the predicted values. Gaussian Processes and Bayesian Neural Networks are increasingly being used, in particular to extrapolate beyond measured data. However, the impact of experimental errors on these extrapolated values is typically not studied. In this work, we focus on understanding how uncertainties propagate from input to prediction when using machine learning methods. We use a Mixture Density Network (MDN) to incorporate experimental error into the training of the network and to construct uncertainties for the associated predicted quantities. We systematically study the effect of the size of the experimental error, both on the reproduced training data and on the extrapolated predictions for fission yields of actinides.
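
For readers unfamiliar with the approach, the following is a minimal one-dimensional Mixture Density Network sketch in PyTorch: the network outputs mixture weights, means and widths, and is trained on synthetic data whose noise level varies with the input, so each prediction carries its own uncertainty. The architecture, data and training settings are assumptions made for illustration, not those of the paper.

```python
# Minimal MDN sketch: the network predicts a Gaussian mixture (weights, means,
# widths) for each input, trained by negative log-likelihood on synthetic,
# heteroscedastic data. Illustrative only; not the paper's network or data.
import torch
import torch.nn as nn

class MDN(nn.Module):
    def __init__(self, n_hidden=32, n_components=3):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(1, n_hidden), nn.Tanh())
        self.pi = nn.Linear(n_hidden, n_components)         # mixture weight logits
        self.mu = nn.Linear(n_hidden, n_components)         # component means
        self.log_sigma = nn.Linear(n_hidden, n_components)  # component widths (log)

    def forward(self, x):
        h = self.shared(x)
        return self.pi(h), self.mu(h), self.log_sigma(h)

def mdn_nll(pi_logits, mu, log_sigma, y):
    # Negative log-likelihood of y under the predicted Gaussian mixture.
    log_pi = torch.log_softmax(pi_logits, dim=-1)
    comp = torch.distributions.Normal(mu, log_sigma.exp())
    log_prob = comp.log_prob(y.expand_as(mu))
    return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()

# Synthetic training data: y = x^2 with input-dependent "experimental" error.
x = torch.linspace(-1, 1, 200).unsqueeze(1)
y = x.pow(2) + 0.05 * (1 + x.abs()) * torch.randn_like(x)

model = MDN()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    loss = mdn_nll(*model(x), y)
    loss.backward()
    opt.step()

# Mixture mean and standard deviation at each input serve as prediction and uncertainty.
with torch.no_grad():
    pi_logits, mu, log_sigma = model(x)
    w = torch.softmax(pi_logits, dim=-1)
    mean = (w * mu).sum(-1)
    std = ((w * (log_sigma.exp() ** 2 + mu ** 2)).sum(-1) - mean ** 2).sqrt()
print(mean[:3], std[:3])
```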


2020
Author(s):  
Anjiao Peng ◽  
Xiaorong Yang ◽  
Zhining Wen ◽  
Wanling Li ◽  
Yusha Tang ◽  
...  

Abstract Background: Stroke is one of the most important causes of epilepsy, and we aimed to determine whether patients at high risk of developing post-stroke epilepsy (PSE) can be identified at the time of discharge using machine learning methods. Methods: Patients with stroke were enrolled and followed for at least one year. Machine learning methods including support vector machine (SVM), random forest (RF) and logistic regression (LR) were trained on the data. Results: A total of 2730 patients with cerebral infarction and 844 patients with cerebral hemorrhage were enrolled; the one-year risk of PSE was 2.8% after cerebral infarction and 7.8% after cerebral hemorrhage. The machine learning methods showed good performance in predicting PSE. The area under the receiver operating characteristic curve (AUC) for SVM and RF in predicting PSE after cerebral infarction was close to 1, compared with 0.92 for LR. When predicting PSE after cerebral hemorrhage, the performance of SVM was best, with an AUC close to 1, followed by RF (AUC = 0.99) and LR (AUC = 0.85). Conclusion: Machine learning methods could be used to identify patients at high risk of developing PSE, which will help to stratify high-risk patients and start treatment earlier. Nevertheless, more work is needed before such a predictive model can be applied in clinical practice.
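
A minimal scikit-learn sketch of this kind of comparison (SVM, random forest and logistic regression scored by ROC AUC on a rare outcome) is shown below. The synthetic, imbalanced dataset stands in for the clinical variables, which the abstract does not enumerate.

```python
# Sketch of the model comparison pattern: SVM, RF and LR scored by ROC AUC on
# a synthetic imbalanced dataset that mimics a rare outcome such as PSE.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=3000, n_features=20, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "SVM": SVC(probability=True, random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "LR": LogisticRegression(max_iter=1000),
}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.2f}")
```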


2021
Author(s):  
Yafei Wu ◽  
Zhongquan Jiang ◽  
Shaowu Lin ◽  
Ya Fang

Abstract Background: Prediction of stroke based on individuals' risk factors, especially for a first stroke event, is of great significance for the primary prevention of high-risk populations. Our study aimed to investigate the applicability of interpretable machine learning for predicting 2-year stroke occurrence in older adults compared with logistic regression. Methods: A total of 5960 participants consecutively surveyed from July 2011 to August 2013 in the China Health and Retirement Longitudinal Study were included for analysis. We constructed a traditional logistic regression (LR) model and two machine learning models, namely random forest (RF) and extreme gradient boosting (XGBoost), to distinguish stroke occurrence from non-occurrence using data on demographics, lifestyle, disease history, and clinical variables. Grid search and 10-fold cross-validation were used to tune the hyperparameters. Model performance was assessed by discrimination, calibration, decision curve and predictiveness curve analysis. Results: Among the 5960 participants, 131 (2.20%) developed stroke over an average follow-up of 2 years. Our prediction models distinguished stroke occurrence from non-occurrence with excellent performance. The AUCs of the machine learning methods (RF, 0.823 [95% CI, 0.759-0.886]; XGBoost, 0.808 [95% CI, 0.730-0.886]) were significantly higher than that of LR (0.718 [95% CI, 0.649-0.787]; p<0.05). No significant difference was observed between RF and XGBoost (p>0.05). All prediction models had good calibration; the Brier scores were 0.022 (95% CI, 0.015-0.028) for LR, 0.019 (95% CI, 0.014-0.025) for RF, and 0.020 (95% CI, 0.015-0.026) for XGBoost. In decision curve analysis, XGBoost had much higher net benefits over a wider threshold range, and in predictiveness curve analysis it was more capable of recognizing high-risk individuals. A total of eight predictors, including gender, waist-to-height ratio, dyslipidemia, glycated hemoglobin, white blood cell count, blood glucose, triglycerides, and low-density lipoprotein cholesterol, ranked among the top five in the three prediction models. Conclusions: Machine learning methods, especially XGBoost, had greater potential than traditional logistic regression for predicting stroke occurrence in older adults.
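
The tuning and evaluation pattern described here can be sketched as follows: grid search under 10-fold cross-validation for a random forest, then discrimination (AUC) and calibration (Brier score) on held-out data. Synthetic data stand in for the CHARLS variables, and XGBoost's scikit-learn wrapper could be tuned in the same way.

```python
# Sketch of hyperparameter tuning (grid search, 10-fold CV) plus discrimination
# and calibration checks on a held-out set, using synthetic data only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=6000, n_features=25, weights=[0.98, 0.02], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

grid = GridSearchCV(
    RandomForestClassifier(random_state=1),
    param_grid={"n_estimators": [200, 500], "max_depth": [4, 8, None]},
    scoring="roc_auc",
    cv=10,
)
grid.fit(X_tr, y_tr)

proba = grid.best_estimator_.predict_proba(X_te)[:, 1]
print("best params:", grid.best_params_)
print(f"AUC = {roc_auc_score(y_te, proba):.3f}, Brier = {brier_score_loss(y_te, proba):.3f}")
```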


2021
Vol 9 (1)
Author(s):  
Hui Yu ◽  
Jian Deng ◽  
Ran Nathan ◽  
Max Kröschel ◽  
Sasha Pekarsky ◽  
...  

Abstract Background: Our understanding of the movement patterns and behaviours of wildlife has advanced greatly through the use of improved tracking technologies, including the application of accelerometry (ACC) across a wide range of taxa. However, most ACC studies either use intermittent sampling, which hinders continuity, or rely on continuous data logging with tracker retrieval for data download, which is not applicable to long-term studies. To allow long-term, fine-scale behavioural research, we evaluated a range of machine learning methods for their suitability for continuous on-board classification of ACC data into behaviour categories prior to data transmission. Methods: We tested six supervised machine learning methods, including linear discriminant analysis (LDA), decision tree (DT), support vector machine (SVM), artificial neural network (ANN), random forest (RF) and extreme gradient boosting (XGBoost), to classify behaviour using ACC data from three bird species (white stork Ciconia ciconia, griffon vulture Gyps fulvus and common crane Grus grus) and two mammals (dairy cow Bos taurus and roe deer Capreolus capreolus). Results: Across a range of quality criteria, SVM, ANN, RF and XGBoost performed well in determining behaviour from ACC data, and their good performance was little affected when the number of input features for model training was greatly reduced. On-board runtime and storage-requirement tests showed that ANN, RF and XGBoost in particular would make suitable on-board classifiers. Conclusions: Our identification of feature reduction in combination with ANN, RF and XGBoost as suitable methods for on-board behavioural classification of continuous ACC data has considerable potential to benefit movement ecology and behavioural research, wildlife conservation and livestock husbandry.
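
As a rough sketch of the on-board classification idea, the code below summarises fixed-length windows of synthetic tri-axial ACC data with a few inexpensive statistics and cross-validates a compact random forest. The window length, feature set and behaviour labels are illustrative assumptions, not those used in the study.

```python
# Sketch of windowed feature extraction from tri-axial accelerometer bursts
# followed by a small classifier that could plausibly run on a tag.
# All data and labels here are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_windows, window_len = 500, 40                 # hypothetical burst length per window
acc = rng.normal(size=(n_windows, window_len, 3))  # synthetic x/y/z bursts
labels = rng.integers(0, 3, n_windows)             # three hypothetical behaviours

def window_features(w):
    # Mean, standard deviation and dynamic range per axis: 9 features per window.
    return np.concatenate([w.mean(axis=0), w.std(axis=0), w.max(axis=0) - w.min(axis=0)])

X = np.array([window_features(w) for w in acc])

# A small forest keeps runtime and storage modest for on-board use.
clf = RandomForestClassifier(n_estimators=50, max_depth=6, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
```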

