An Autoencoder and Machine Learning Model to Predict Suicidal Ideation with Brain Structural Imaging

2020 ◽  
Vol 9 (3) ◽  
pp. 658 ◽  
Author(s):  
Jun-Cheng Weng ◽  
Tung-Yeh Lin ◽  
Yuan-Hsiung Tsai ◽  
Man Teng Cheok ◽  
Yi-Peng Eve Chang ◽  
...  

It is estimated that at least one million people die by suicide every year, showing the importance of suicide prevention and detection. In this study, an autoencoder and machine learning model was employed to predict people with suicidal ideation based on their structural brain imaging. The subjects in our generalized q-sampling imaging (GQI) dataset consisted of three groups: 41 depressive patients with suicidal ideation (SI), 54 depressive patients without suicidal thoughts (NS), and 58 healthy controls (HC). In the GQI dataset, indices of generalized fractional anisotropy (GFA), isotropic values of the orientation distribution function (ISO), and normalized quantitative anisotropy (NQA) were separately trained in different machine learning models. A convolutional neural network (CNN)-based autoencoder model, the supervised machine learning algorithm extreme gradient boosting (XGB), and logistic regression (LR) were used to discriminate SI subjects from NS and HC subjects. After five-fold cross-validation, held-out data were tested to obtain the accuracy, sensitivity, specificity, and area under the curve of each result. Our results showed that the best pattern of structure across multiple brain locations can distinguish individuals with suicidal ideation from NS and HC subjects with a prediction accuracy of 85%, a specificity of 100%, and a sensitivity of 75%. The algorithms developed here might provide an objective tool to help identify suicidal ideation risk among depressed patients alongside clinical assessment.
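The five-fold cross-validation with sensitivity/specificity/AUC reporting described above can be sketched as follows. This is a minimal illustration on synthetic data: the feature matrix, group sizes, and the logistic-regression classifier are stand-ins, not the authors' GQI imaging pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score, confusion_matrix

# Synthetic stand-in for imaging-derived feature vectors (the paper uses
# voxel-wise GFA/ISO/NQA maps); 153 subjects, ~27% positive class.
X, y = make_classification(n_samples=153, n_features=20, weights=[0.73],
                           random_state=0)

aucs, sens, spec = [], [], []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    prob = clf.predict_proba(X[test_idx])[:, 1]
    pred = (prob >= 0.5).astype(int)
    tn, fp, fn, tp = confusion_matrix(y[test_idx], pred).ravel()
    aucs.append(roc_auc_score(y[test_idx], prob))
    sens.append(tp / (tp + fn))   # sensitivity: recall on the positive class
    spec.append(tn / (tn + fp))   # specificity: recall on the negative class

print(f"AUC {np.mean(aucs):.2f}  sens {np.mean(sens):.2f}  spec {np.mean(spec):.2f}")
```

Stratified folds keep the class ratio stable in every split, which matters here because the SI group is the minority class.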

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Chalachew Muluken Liyew ◽  
Haileyesus Amsaya Melese

Abstract Predicting the amount of daily rainfall improves agricultural productivity and secures food and water supply to keep citizens healthy. To predict rainfall, several types of research have been conducted using data mining and machine learning techniques on different countries’ environmental datasets. An erratic rainfall distribution affects the agriculture on which the country’s economy depends. Wise use of rainfall water should be planned and practiced to minimize the problems of drought and flooding. The main objective of this study is to identify the relevant atmospheric features that cause rainfall and predict the intensity of daily rainfall using machine learning techniques. The Pearson correlation technique was used to select relevant environmental variables, which were used as input for the machine learning models. The dataset was collected from the local meteorological office at Bahir Dar City, Ethiopia, to measure the performance of three machine learning techniques (Multivariate Linear Regression, Random Forest, and Extreme Gradient Boosting). Root mean squared error and mean absolute error were used to measure the performance of the machine learning models. The result of the study revealed that the Extreme Gradient Boosting machine learning algorithm performed better than the others.
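The pipeline described above — Pearson-correlation feature screening followed by a boosted-tree regressor scored with RMSE and MAE — can be sketched on synthetic data. The weather column names and the correlation threshold are illustrative assumptions, and scikit-learn's GradientBoostingRegressor stands in for the XGBoost library.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error

rng = np.random.default_rng(42)
# Hypothetical daily weather table: columns stand in for humidity, temperature,
# pressure, and wind speed; rainfall depends mostly on the first two.
X = rng.normal(size=(500, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

# Pearson correlation screening: keep features with |r| above a threshold.
r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
keep = np.abs(r) > 0.2
X_sel = X[:, keep]

X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, test_size=0.2, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)
rmse = mean_squared_error(y_te, pred) ** 0.5
mae = mean_absolute_error(y_te, pred)
print(f"kept {keep.sum()} of 4 features; RMSE {rmse:.2f}, MAE {mae:.2f}")
```

Because RMSE penalizes large errors more heavily than MAE, reporting both (as the study does) reveals whether a model's errors are dominated by a few extreme days.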


2017 ◽  
Vol 35 (15_suppl) ◽  
pp. e15090-e15090
Author(s):  
Shin Yin Lee ◽  
Vijaya B. Kolachalama ◽  
Umit Tapan ◽  
Janice Weinberg ◽  
Jean M. Francis ◽  
...  

e15090 Background: Aberrant hyperactive Wnt/β-catenin signaling is critical in colorectal cancer (CRC) tumorigenesis. Casitas B-lineage Lymphoma (c-Cbl) is a negative regulator of Wnt signaling, and functions as a tumor suppressor. The objective of this study was to evaluate c-Cbl expression as a predictive marker of survival in patients with metastatic CRC (mCRC). Methods: Patients with mCRC treated at Boston University Medical Center between 2004 and 2014 were analyzed. c-Cbl and nuclear β-catenin expression were quantified in explanted biopsies using a customized color-based image segmentation pipeline. Quantification was normalized to the total tumor area in an image, and deemed ‘low’ or ‘high’ according to the mean normalized values of the cohort. A supervised machine-learning model based on bootstrap aggregating was constructed with c-Cbl expression as the input feature and 3-year survival as output. Results: Of the 72 subjects with mCRC, 52.78% had high and 47.22% had low c-Cbl expression. Patients with high c-Cbl had significantly better median overall survival than those with low c-Cbl expression (3.7 years vs. 1.8 years; p = 0.0026), and experienced superior 3-year survival (47.37% vs 20.59%; p = 0.017). Intriguingly, nuclear β-catenin expression did not correlate with survival. No significant differences were detected between high and low c-Cbl groups in baseline characteristics (demographics, comorbidities), tumor-related parameters (primary tumor location, number of metastases, molecular features) or therapy received (surgery, chemotherapy regimen). A 5-fold cross-validated machine-learning model associated with 3-year survival demonstrated an area under the curve of 0.729, supporting c-Cbl expression as a predictor of mCRC survival. Conclusions: Our results show that c-Cbl expression is associated with and predicts mCRC survival.
Demonstration of these findings despite the small cohort size underscores the power of quantitative histology and machine-learning application. While further work is needed to validate c-Cbl as a novel biomarker of mCRC survival, this study supports c-Cbl as a regulator of Wnt/β-catenin signaling and a suppressor of other oncogenes in CRC tumorigenesis.
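A bootstrap-aggregating (bagging) classifier with a single input feature and 5-fold cross-validated AUC, as described in the Methods, can be sketched as follows. The expression values and survival labels are simulated; the association strength and cohort size only loosely mimic the study.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
# Hypothetical normalized c-Cbl expression for 72 subjects; higher expression
# is made to associate (noisily) with 3-year survival, the binary label.
expression = rng.uniform(0, 1, size=72)
survived = (expression + rng.normal(scale=0.3, size=72) > 0.5).astype(int)
X = expression.reshape(-1, 1)          # a single input feature, as in the study

# Bagging fits many trees on bootstrap resamples and averages their votes,
# which stabilizes estimates from a small cohort.
bag = BaggingClassifier(n_estimators=50, random_state=0)
aucs = cross_val_score(bag, X, survived, cv=5, scoring="roc_auc")
print(f"5-fold CV AUC: {aucs.mean():.3f}")
```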


2021 ◽  
Author(s):  
Eric Sonny Mathew ◽  
Moussa Tembely ◽  
Waleed AlAmeri ◽  
Emad W. Al-Shalabi ◽  
Abdul Ravoof Shaik

Abstract A meticulous interpretation of steady-state or unsteady-state relative permeability (Kr) experimental data is required to determine a complete set of Kr curves. In this work, three different machine learning models were developed to assist in a faster estimation of these curves from steady-state drainage coreflooding experimental runs. The three models that were tested and compared were extreme gradient boosting (XGB), deep neural network (DNN) and recurrent neural network (RNN) algorithms. Based on existing mathematical models, a leading-edge framework was developed in which a large database of Kr and Pc curves was generated. This database was used to perform thousands of coreflood simulation runs representing oil-water drainage steady-state experiments. The results obtained from these simulation runs, mainly pressure drop along with other conventional core analysis data, were utilized to estimate Kr curves based on Darcy's law. These analytically estimated Kr curves, along with the previously generated Pc curves, were fed as features into the machine learning models. The entire data set was split into 80% for training and 20% for testing. A k-fold cross-validation technique was applied by splitting the 80% training portion into 10 folds. In this manner, for each of the 10 experiments, 9 folds were used for training and the remaining one was used for model validation. Once trained and validated, the model was subjected to blind testing on the remaining 20% of the data set. The machine learning model learns to capture fluid flow behavior inside the core from the training dataset. The trained/tested model was thereby employed to estimate Kr curves based on available experimental results. The performance of the developed model was assessed using the values of the coefficient of determination (R2) along with the loss calculated during training/validation of the model.
The respective cross plots, along with comparisons of ground-truth versus AI-predicted curves, indicate that the model is capable of making accurate predictions, with error percentages between 0.2 and 0.6% on history matching experimental data for all three tested ML techniques (XGB, DNN, and RNN). This implies that the AI-based model exhibits better efficiency and reliability in determining Kr curves when compared to conventional methods. The results also include a comparison between classical machine learning approaches and shallow and deep neural networks in terms of accuracy in predicting the final Kr curves. The various models discussed in this research work currently focus on the prediction of Kr curves for drainage steady-state experiments; however, the work can be extended to capture the imbibition cycle as well.
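The data-splitting protocol described above — an 80/20 train/blind-test split with 10-fold cross-validation inside the training portion — can be sketched as follows. The features and target are synthetic stand-ins for the simulated coreflood inputs and Kr values, and a gradient-boosting regressor stands in for the XGB model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split, KFold
from sklearn.metrics import r2_score

rng = np.random.default_rng(7)
# Stand-in for the simulated coreflood features (pressure drop, Pc points,
# saturations); the target mimics one point on a Kr curve.
X = rng.normal(size=(300, 6))
y = X @ rng.normal(size=6) + rng.normal(scale=0.1, size=300)

# 80/20 split: 20% held out entirely for blind testing.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# 10-fold CV on the 80% training portion: 9 folds train, 1 fold validates.
fold_r2 = []
for tr_idx, va_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(X_tr):
    m = GradientBoostingRegressor(random_state=0).fit(X_tr[tr_idx], y_tr[tr_idx])
    fold_r2.append(r2_score(y_tr[va_idx], m.predict(X_tr[va_idx])))

# Final blind test on the untouched 20%.
final = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
print(f"mean CV R2 {np.mean(fold_r2):.3f}; blind-test R2 "
      f"{r2_score(y_te, final.predict(X_te)):.3f}")
```

Keeping the 20% test set outside the cross-validation loop is what makes the final evaluation "blind": no fold ever sees those samples during training or validation.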


2020 ◽  
pp. postgradmedj-2020-138899
Author(s):  
Yiftach Barash ◽  
Shelly Soffer ◽  
Ehud Grossman ◽  
Noam Tau ◽  
Vera Sorin ◽  
...  

Objectives Physicians continuously make tough decisions when discharging patients. Alerting on poor outcomes may help in this decision. This study evaluates a machine learning model for predicting 30-day mortality in emergency department (ED) discharged patients. Methods We retrospectively analysed visits of adult patients discharged from a single ED (1/2014–12/2018). Data included demographics, evaluation and treatment in the ED, and discharge diagnosis. The data comprised both structured and free-text fields. A gradient boosting model was trained to predict mortality within 30 days of release from the ED. The model was trained on data from the years 2014–2017 and validated on data from the year 2018. In order to reduce potential end-of-life bias, a subgroup analysis was performed for non-oncological patients. Results Overall, 363 635 ED visits of discharged patients were analysed. The 30-day mortality rate was 0.8%. A majority of the mortality cases (65.3%) had a known oncological disease. The model yielded an area under the curve (AUC) of 0.97 (95% CI 0.96 to 0.97) for predicting 30-day mortality. For a sensitivity of 84% (95% CI 0.81 to 0.86), this model had a false positive rate of 1:20. For patients without a known malignancy, the model yielded an AUC of 0.94 (95% CI 0.92 to 0.95). Conclusions Although not frequent, patients may die following ED discharge. Machine learning-based tools may help ED physicians identify patients at risk. An optimised decision for hospitalisation or palliative management may improve patient care and system resource allocation.
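The evaluation style above — train on earlier years, validate on a later year, then read the false-positive rate off the ROC curve at a chosen sensitivity — can be sketched on synthetic data. The class balance roughly mimics the rare 30-day mortality outcome; the temporal split is simulated by slicing, and the real model also ingested free-text fields, which this sketch omits.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_curve, roc_auc_score

# Synthetic stand-in for ED visits with a rare (~1%) 30-day mortality label.
X, y = make_classification(n_samples=20000, n_features=15, weights=[0.99],
                           flip_y=0.002, random_state=0)
X_tr, y_tr = X[:15000], y[:15000]      # analogous to the 2014-2017 training years
X_va, y_va = X[15000:], y[15000:]      # analogous to the 2018 validation year

clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
prob = clf.predict_proba(X_va)[:, 1]
auc = roc_auc_score(y_va, prob)

# Pick the operating point closest to 84% sensitivity and read off the FPR.
fpr, tpr, thr = roc_curve(y_va, prob)
i = int(np.argmin(np.abs(tpr - 0.84)))
print(f"AUC {auc:.2f}; at sensitivity {tpr[i]:.2f} the FPR is {fpr[i]:.3f}")
```

With a 0.8% outcome rate, the FPR (reported in the abstract as 1:20) is the clinically relevant number: it says how many extra alerts physicians must review per flagged death.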


2021 ◽  
Author(s):  
Íris Viana dos Santos Santana ◽  
Andressa C. M. da Silveira ◽  
Álvaro Sobrinho ◽  
Lenardo Chaves e Silva ◽  
Leandro Dias da Silva ◽  
...  

BACKGROUND Controlling the COVID-19 outbreak in Brazil is considered a challenge of continental proportions due to the high population and urban density, weak implementation and maintenance of social distancing strategies, and limited testing capabilities. OBJECTIVE To contribute to addressing such a challenge, we present the implementation and evaluation of supervised Machine Learning (ML) models to assist COVID-19 detection in Brazil based on early-stage symptoms. METHODS Firstly, we conducted data preprocessing and applied the Chi-squared test in a Brazilian dataset, mainly composed of early-stage symptoms, to perform statistical analyses. Afterward, we implemented ML models using the Random Forest (RF), Support Vector Machine (SVM), Multilayer Perceptron (MLP), K-Nearest Neighbors (KNN), Decision Tree (DT), Gradient Boosting Machine (GBM), and Extreme Gradient Boosting (XGBoost) algorithms. We evaluated the ML models using precision, accuracy score, recall, the area under the curve, and the Friedman and Nemenyi tests. Based on the comparison, we grouped the top five ML models and measured feature importance. RESULTS The MLP model presented the highest mean accuracy score, with more than 97.85%, when compared to GBM (> 97.39%), RF (> 97.36%), DT (> 97.07%), XGBoost (> 97.06%), KNN (> 95.14%), and SVM (> 94.27%). Based on the statistical comparison, we grouped MLP, GBM, DT, RF, and XGBoost as the top five ML models, because their evaluation results are statistically indistinguishable. The features most important to the ML models' predictions included gender, profession, fever, sore throat, dyspnea, olfactory disorder, cough, runny nose, taste disorder, and headache. CONCLUSIONS Supervised ML models effectively assist decision making in medical diagnosis and public administration (e.g., testing strategies), based on early-stage symptoms that do not require advanced and expensive exams.
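The two statistical steps named above — a Chi-squared test on a symptom/outcome table, and a Friedman test to decide whether per-fold classifier scores are distinguishable — can be sketched as follows. All numbers are illustrative, not the study's; the post-hoc Nemenyi grouping would typically use an additional package such as scikit-posthocs.

```python
import numpy as np
from scipy.stats import chi2_contingency, friedmanchisquare

# Hypothetical 2x2 table: symptom (fever yes/no) vs. COVID-19 test result.
table = np.array([[120,  80],
                  [ 60, 140]])
chi2, p_chi2, dof, _ = chi2_contingency(table)
print(f"Chi-squared {chi2:.1f}, p = {p_chi2:.2e}, dof = {dof}")

# Friedman test over per-fold accuracy scores of three classifiers
# (illustrative numbers). A significant result means at least one model
# differs; a post-hoc Nemenyi test would then group the indistinguishable ones.
mlp = [0.979, 0.978, 0.980, 0.977, 0.979]
gbm = [0.974, 0.973, 0.975, 0.972, 0.974]
svm = [0.943, 0.941, 0.944, 0.942, 0.940]
stat, p_fried = friedmanchisquare(mlp, gbm, svm)
print(f"Friedman statistic {stat:.2f}, p = {p_fried:.4f}")
```

The Friedman test is rank-based and paired by fold, which is why the study can claim the top five models are "statistically indistinguishable" rather than simply comparing mean accuracies.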


Diagnostics ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 82
Author(s):  
Chun-Chuan Hsu ◽  
Cheng-CJ Chu ◽  
Ching-Heng Lin ◽  
Chien-Hsiung Huang ◽  
Chip-Jin Ng ◽  
...  

Seventy-two-hour unscheduled return visits (URVs) by emergency department patients are a key clinical index for evaluating the quality of care in emergency departments (EDs). This study aimed to develop a machine learning model to predict 72 h URVs for ED patients with abdominal pain. Electronic health records data were collected from the Chang Gung Research Database (CGRD) for 25,151 ED visits by patients with abdominal pain and a total of 617 features were used for analysis. We used supervised machine learning models, namely logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGB), and voting classifier (VC), to predict URVs. The VC model achieved more favorable overall performance than other models (AUROC: 0.74; 95% confidence interval (CI), 0.69–0.76; sensitivity, 0.39; specificity, 0.89; F1 score, 0.25). The reduced VC model achieved comparable performance (AUROC: 0.72; 95% CI, 0.69–0.74) to the full models using all clinical features. The VC model exhibited the most favorable performance in predicting 72 h URVs for patients with abdominal pain, both for all-features and reduced-features models. Application of the VC model in the clinical setting after validation may help physicians to make accurate decisions and decrease URVs.
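A soft-voting classifier built from the base learners named above (LR, SVM excluded here for brevity, RF, XGB approximated by gradient boosting) can be sketched as follows. The synthetic data stand in for the 617 EHR features, and the class imbalance loosely mimics the rare 72 h return-visit outcome.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (RandomForestClassifier,
                              GradientBoostingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the EHR feature table; ~10% positive outcome.
X, y = make_classification(n_samples=4000, n_features=30, weights=[0.9],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=0)

# Soft voting averages the member models' predicted probabilities,
# often smoothing out individual models' miscalibrations.
vc = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("gb", GradientBoostingClassifier(random_state=0))],
    voting="soft")
vc.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, vc.predict_proba(X_te)[:, 1])
print(f"Voting-classifier AUROC: {auc:.3f}")
```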


2021 ◽  
Author(s):  
Domingos Andrade ◽  
Luigi Ribeiro ◽  
Agnaldo Lopes ◽  
Jorge Amaral ◽  
Pedro Lopes de Melo

Abstract Background The use of machine learning (ML) methods could improve the diagnosis of respiratory changes in systemic sclerosis (SSc). This paper evaluates the performance of several ML algorithms associated with respiratory oscillometry analysis to aid in the diagnosis of respiratory changes in SSc. We also determined the best configuration for this task. Methods Oscillometric and spirometric exams were performed in 82 individuals, including controls (n=30) and patients with systemic sclerosis with normal (n=22) and abnormal (n=30) spirometry. Multiple-instance classifiers and different supervised machine learning techniques were investigated, including k-nearest neighbours (KNN), random forests (RF), AdaBoost with decision trees (ADAB), and Extreme Gradient Boosting (XGB). Results and discussion The first experiment of this study showed that the best oscillometric parameter (BOP) was dynamic compliance. In the scenario Control Group versus Patients with Sclerosis and normal spirometry (CGvsPSNS), it provided moderate accuracy (AUC=0.77). In the scenario Control Group versus Patients with Sclerosis and altered spirometry (CGvsPSAS), the BOP obtained high accuracy (AUC=0.94). In the second experiment, the ML techniques were used. In CGvsPSNS, KNN achieved the best result (AUC=0.90), significantly improving the accuracy in comparison with the BOP (p<0.01), while in CGvsPSAS, RF obtained the best results (AUC=0.97), also significantly improving the diagnostic accuracy (p<0.05). In the third, fourth, fifth, and sixth experiments, the use of different feature selection techniques allowed us to spot the best oscillometric parameters. They all showed a small increase in diagnostic accuracy in CGvsPSNS, respectively 0.87, 0.86, 0.82, and 0.84, while in CGvsPSAS, the performance of the best classifier remained the same (AUC=0.97).
Conclusions Oscillometric principles combined with machine learning algorithms provide a new method for the diagnosis of respiratory changes in patients with systemic sclerosis. The findings of the present study provide evidence that this combination may play an important role in the early diagnosis of respiratory changes in these patients.
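The contrast drawn above between a single best oscillometric parameter (experiment 1) and a multivariate ML classifier (experiment 2) can be sketched as follows. The feature table and group size are simulated stand-ins for the oscillometry parameters and the CGvsPSNS cohort; the comparison is illustrative, not a reproduction of the study's AUCs.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

# Hypothetical oscillometry table: each column mimics one parameter
# (resistance, reactance, dynamic compliance, ...); 52 subjects stand in
# for controls vs. SSc patients with normal spirometry.
X, y = make_classification(n_samples=52, n_features=6, n_informative=3,
                           random_state=3)

# Experiment-1 analogue: AUC of each single parameter used alone.
single_aucs = [roc_auc_score(y, X[:, j]) for j in range(X.shape[1])]
best_single = max(max(a, 1 - a) for a in single_aucs)  # direction-agnostic

# Experiment-2 analogue: KNN over all parameters, cross-validated.
prob = cross_val_predict(KNeighborsClassifier(n_neighbors=5), X, y,
                         cv=5, method="predict_proba")[:, 1]
knn_auc = roc_auc_score(y, prob)
print(f"best single-parameter AUC {best_single:.2f}; KNN AUC {knn_auc:.2f}")
```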


2021 ◽  
Vol 11 (11) ◽  
pp. 1055
Author(s):  
Pei-Chen Lin ◽  
Kuo-Tai Chen ◽  
Huan-Chieh Chen ◽  
Md. Mohaimenul Islam ◽  
Ming-Chin Lin

Accurate stratification of sepsis can effectively guide the triage of patient care and shared decision making in the emergency department (ED). However, previous research on sepsis identification models focused mainly on ICU patients, and discrepancies in model performance between the development and external validation datasets are rarely evaluated. The aim of our study was to develop and externally validate a machine learning model to stratify sepsis patients in the ED. We retrospectively collected clinical data from two geographically separate institutes that provided a different level of care at different time periods. The Sepsis-3 criteria were used as the reference standard in both datasets for identifying true sepsis cases. An eXtreme Gradient Boosting (XGBoost) algorithm was developed to stratify sepsis patients, and the performance of the model was compared with traditional clinical sepsis tools: quick Sequential Organ Failure Assessment (qSOFA) and Systemic Inflammatory Response Syndrome (SIRS). There were 8296 patients (1752 (21%) being septic) in the development and 1744 patients (506 (29%) being septic) in the external validation datasets. The mortality of septic patients in the development and validation datasets was 13.5% and 17%, respectively. In the internal validation, XGBoost achieved an area under the receiver operating characteristic curve (AUROC) of 0.86, exceeding SIRS (0.68) and qSOFA (0.56). The performance of XGBoost deteriorated in the external validation (the AUROC of XGBoost, SIRS and qSOFA was 0.75, 0.57 and 0.66, respectively). Heterogeneity in patient characteristics, such as sepsis prevalence, severity, age, comorbidity and infection focus, could reduce model performance. Our model showed good discriminative capabilities for the identification of sepsis patients and outperformed the existing sepsis identification tools. Implementation of the ML model in the ED can facilitate timely sepsis identification and treatment.
However, dataset discrepancies should be carefully evaluated before implementing the ML approach in clinical practice. This finding reinforces the necessity for future studies to perform external validation to ensure the generalisability of any developed ML approaches.
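The internal-versus-external validation gap described above can be sketched by training on one simulated cohort and scoring on a second cohort with shifted feature distributions and a different outcome prevalence. The cohort generator, shift magnitude, and gradient-boosting model (standing in for XGBoost) are all illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_cohort(n, prevalence, shift=0.0):
    """Binary cohort whose feature means shift between institutions."""
    y = (rng.random(n) < prevalence).astype(int)
    X = rng.normal(size=(n, 8)) + y[:, None] * 1.0 + shift
    return X, y

# Development cohort (21% sepsis) and a shifted external cohort (29% sepsis),
# mirroring the prevalences reported in the abstract.
X_dev, y_dev = make_cohort(8296, 0.21)
X_ext, y_ext = make_cohort(1744, 0.29, shift=0.8)

clf = GradientBoostingClassifier(random_state=0).fit(X_dev[:6000], y_dev[:6000])
auc_int = roc_auc_score(y_dev[6000:], clf.predict_proba(X_dev[6000:])[:, 1])
auc_ext = roc_auc_score(y_ext, clf.predict_proba(X_ext)[:, 1])
print(f"internal AUROC {auc_int:.2f}; external AUROC {auc_ext:.2f}")
```

Because tree ensembles memorize split thresholds from the development data, a constant covariate shift in the external cohort can push patients past learned thresholds and degrade discrimination, which is the failure mode the abstract warns about.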


Hypertension ◽  
2021 ◽  
Vol 78 (5) ◽  
pp. 1595-1604
Author(s):  
Fabrizio Buffolo ◽  
Jacopo Burrello ◽  
Alessio Burrello ◽  
Daniel Heinrich ◽  
Christian Adolf ◽  
...  

Primary aldosteronism (PA) is the cause of arterial hypertension in 4% to 6% of patients, and 30% of patients with PA are affected by unilateral and surgically curable forms. Current guidelines recommend screening ≈50% of patients with hypertension for PA on the basis of individual factors, while some experts suggest screening all patients with hypertension. To define the risk of PA and tailor the diagnostic workup to the individual risk of each patient, we developed a conventional scoring system and supervised machine learning algorithms using a retrospective cohort of 4059 patients with hypertension. On the basis of 6 widely available parameters, we developed a numerical score and 308 machine learning-based models, selecting the one with the highest diagnostic performance. After validation, we obtained high predictive performance with our score (optimized sensitivity of 90.7% for PA and 92.3% for unilateral PA [UPA]). The machine learning-based model provided the highest performance, with an area under the curve of 0.834 for PA and 0.905 for diagnosis of UPA, with optimized sensitivity of 96.6% for PA and 100.0% for UPA at validation. The application of the predicting tools allowed the identification of a subgroup of patients with very low risk of PA (0.6% for both models) and null probability of having UPA. In conclusion, this score and the machine learning algorithm can accurately predict the individual pretest probability of PA in patients with hypertension and circumvent screening in up to 32.7% of patients using a machine learning-based model, without omitting patients with surgically curable UPA.
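One common way to build a "conventional" numerical score from a handful of widely available parameters, as the study describes, is to fit a logistic regression and round its coefficients into integer points. The sketch below uses six synthetic binary items; the point-assignment rule and all data are illustrative assumptions, not the authors' published score.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Six binary stand-ins for widely available clinical parameters; the rare
# positive label mimics primary aldosteronism among hypertensive patients.
X, y = make_classification(n_samples=4000, n_features=6, n_informative=4,
                           n_redundant=0, weights=[0.95], random_state=0)
X = (X > 0).astype(int)                  # dichotomize into yes/no items

lr = LogisticRegression().fit(X, y)
# Points-based score: scale coefficients so the largest maps to 5 points.
points = np.round(5 * lr.coef_[0] / np.abs(lr.coef_[0]).max()).astype(int)
score = X @ points                       # per-patient point total

auc_score = roc_auc_score(y, score)
auc_model = roc_auc_score(y, lr.predict_proba(X)[:, 1])
print(f"points per item: {points}; score AUC {auc_score:.2f} "
      f"vs model AUC {auc_model:.2f}")
```

Rounding to integer points sacrifices a little discrimination relative to the full model, which mirrors the study's finding that the ML-based model outperformed the conventional score.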

