Probabilistic analysis of soil-water characteristic curve based on machine learning algorithms

2021 ◽  
Vol 861 (6) ◽  
pp. 062030
Author(s):  
S Yang ◽  
P Q Zheng ◽  
Y T Yu ◽  
J Zhang
2021 ◽  
Vol 11 ◽  
Author(s):  
Ximing Nie ◽  
Yuan Cai ◽  
Jingyi Liu ◽  
Xiran Liu ◽  
Jiahui Zhao ◽  
...  

Objectives: This study aims to investigate whether machine learning algorithms could provide an optimal early mortality prediction method, compared with other scoring systems, for patients with cerebral hemorrhage in intensive care units in clinical practice.

Methods: All cerebral hemorrhage patients admitted to intensive care units between 2008 and 2012 and monitored with the MetaVision system in the Medical Information Mart for Intensive Care III (MIMIC-III) database were enrolled in this study. The calibration, discrimination, and risk classification of hospital mortality predicted by machine learning algorithms were assessed. The primary outcome was hospital mortality. Model performance was assessed with accuracy and receiver operating characteristic (ROC) curve analysis.

Results: Of the 760 cerebral hemorrhage patients enrolled from the MIMIC-III database [mean age, 68.2 years (SD, ±15.5)], 383 (50.4%) patients died in hospital and 377 (49.6%) survived. The area under the ROC curve (AUC) of the six machine learning algorithms was 0.600 (nearest neighbors), 0.617 (decision tree), 0.655 (neural net), 0.671 (AdaBoost), 0.819 (random forest), and 0.725 (gcForest). The AUC of the Acute Physiology and Chronic Health Evaluation II score was 0.423. The random forest had the highest specificity and accuracy, as well as the greatest AUC, showing the best ability to predict in-hospital mortality.

Conclusions: Compared with the conventional scoring system and the other five machine learning algorithms in this study, the random forest algorithm performed better at predicting in-hospital mortality for cerebral hemorrhage patients in intensive care units, and thus warrants further research.
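The model comparison described above (fitting several classifiers and ranking them by AUC) can be sketched as follows. This is an illustrative sketch with scikit-learn on synthetic data, not the authors' code or the MIMIC-III cohort; the feature count and classifier settings are assumptions.

```python
# Sketch: ranking classifiers by ROC AUC, as in the study's comparison.
# Synthetic data stands in for the MIMIC-III cohort (not the real data).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# 760 samples to mirror the cohort size; features are synthetic
X, y = make_classification(n_samples=760, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "nearest neighbors": KNeighborsClassifier(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
}

aucs = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    # predicted probability of the positive class (in-hospital death)
    scores = model.predict_proba(X_te)[:, 1]
    aucs[name] = roc_auc_score(y_te, scores)

best = max(aucs, key=aucs.get)  # the model with the greatest AUC
```

On real data one would also report calibration and risk classification, as the study does, rather than AUC alone.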


2021 ◽  
Author(s):  
Yaltafit Abror Jeem ◽  
Refa Nabila ◽  
Dwi Ditha Emelia ◽  
Lutfan Lazuardi ◽  
Hari Kusnanto Josef

Abstract

Background: One strategy for addressing the increasing prevalence of type 2 diabetes mellitus (T2DM) is to identify and administer interventions to prediabetes patients. Risk assessment tools help detect disease by targeting screening to high-risk groups, and machine learning is also used to aid the diagnosis and identification of prediabetes. This review aims to determine the diagnostic test accuracy of various machine learning algorithms for calculating prediabetes risk.

Methods: This protocol was written in compliance with the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (PRISMA-P) statement. The databases to be searched are PubMed, ProQuest, and EBSCO, restricted to English-language articles published between January 1999 and May 2019. Articles will be identified independently by two reviewers through titles, then abstracts, and then full-text articles; any disagreement will be resolved by consensus. The Newcastle-Ottawa Quality Assessment Scale will be used to measure quality and potential bias. Data extraction and content analysis will be performed systematically. Quantitative data will be visualized using a forest plot with 95% confidence intervals. The diagnostic test outcome will be described by the summary receiver operating characteristic curve. Data will be analyzed using the Review Manager 5.3 (RevMan 5.3) software package.

Discussion: This proposed systematic review and meta-analysis will yield the diagnostic accuracy of various machine learning algorithms for prediabetes risk estimation.

Systematic review registration: This protocol has been registered in the Prospective Registry of Systematic Reviews (PROSPERO) database under registration number CRD42021251242.


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Stephanie O Frisch ◽  
Zeineb Bouzid ◽  
Jessica Zègre-Hemsey ◽  
Clifton W Callaway ◽  
Holli A Devon ◽  
...  

Introduction: Overcrowded emergency departments (ED) and undifferentiated patients make the provision of care and resources challenging. We examined whether machine learning algorithms could identify ED patients’ disposition (hospitalization and critical care admission) using readily available objective triage data among patients with symptoms suggestive of acute coronary syndrome (ACS).

Methods: This was a retrospective observational cohort study of adult patients who were triaged at the ED for a suspected coronary event. A total of 162 input variables (k) were extracted from the electronic health record: demographics (k=3), mode of transportation (k=1), past medical/surgical history (k=57), first ED vital signs (k=7), home medications (k=31), symptomology (k=40), and the computer-generated automatic interpretation of the 12-lead electrocardiogram (k=23). The primary outcomes were hospitalization and critical care admission (i.e., admission to an intensive or step-down care unit). We used 10-fold stratified cross-validation to evaluate the performance of five machine learning algorithms in predicting the study outcomes: logistic regression, naïve Bayes, random forest, gradient boosting, and artificial neural network classifiers. We determined the best model by comparing the area under the receiver operating characteristic curve (AUC) of all models.

Results: Included were 1201 patients (age 64±14, 39% female, 10% Black) with a total of 956 hospitalizations and 169 critical care admissions. The best-performing machine learning classifier for the outcome of hospitalization was the gradient boosting machine, with an AUC of 0.85 (95% CI, 0.82–0.89), 89% sensitivity, and an F-score of 0.83; the random forest classifier performed best for the outcome of critical care admission, with an AUC of 0.73 (95% CI, 0.70–0.77), 76% sensitivity, and an F-score of 0.56.

Conclusion: Predictive machine learning algorithms demonstrate excellent to good discriminative power for predicting hospitalization and critical care admission, respectively. Administrators and clinicians could benefit from machine learning approaches to predict hospitalization and critical care admission, to optimize and allocate scarce ED and hospital resources and provide optimal care.
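The evaluation protocol above — 10-fold stratified cross-validation scored by AUC — can be sketched as follows. This is a minimal scikit-learn illustration on synthetic data, assuming default gradient boosting hyperparameters; it is not the study's pipeline or its 162 EHR variables.

```python
# Sketch: 10-fold stratified cross-validation of a gradient boosting
# classifier, scored by ROC AUC, on a synthetic stand-in dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# 1201 samples to mirror the cohort; the ~80% positive rate loosely
# mirrors the hospitalization outcome (956/1201)
X, y = make_classification(
    n_samples=1201, n_features=30, weights=[0.2, 0.8], random_state=0
)

# stratification keeps the outcome ratio constant across the 10 folds
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
auc_scores = cross_val_score(
    GradientBoostingClassifier(random_state=0), X, y, cv=cv, scoring="roc_auc"
)
mean_auc = auc_scores.mean()  # the figure used to compare models
```

Each candidate model would be run through the same `cv` object so that all five algorithms are compared on identical folds.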


2021 ◽  
Vol 5 (2) ◽  
pp. 20-25
Author(s):  
Azhi Abdalmohammed Faraj ◽  
Didam Ahmed Mahmud ◽  
Bilal Najmaddin Rashid

Credit card defaults pose a business-critical threat to banking systems, so prompt detection of defaulters is a crucial and challenging research problem. Machine learning algorithms must deal with a heavily skewed dataset, since the ratio of defaulters to non-defaulters is very small. The purpose of this research is to apply different ensemble methods and compare their performance in predicting the probability of customers’ credit card default payments, using the Taiwan dataset from the UCI Machine Learning Repository. This is done both on the original skewed dataset and on a balanced dataset. Although several studies have shown the superiority of neural networks over traditional machine learning algorithms, the results of our study show that ensemble methods consistently outperform neural networks and other machine learning algorithms in terms of F1 score and area under the receiver operating characteristic curve, regardless of whether the dataset is balanced.
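The skewed-versus-balanced comparison described above can be sketched as follows. This is an illustrative sketch, not the study's code: it uses synthetic data in place of the UCI Taiwan dataset, one ensemble (gradient boosting) in place of the full set of methods compared, and plain random oversampling as the assumed balancing technique.

```python
# Sketch: evaluating an ensemble on a skewed dataset, before and after
# oversampling the minority (defaulter) class. Synthetic stand-in data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# ~22% defaulters, roughly the ratio in the Taiwan credit card data
X, y = make_classification(n_samples=3000, weights=[0.78, 0.22], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

def fit_eval(X_train, y_train):
    """Fit on the given training set; report F1 and ROC AUC on the
    untouched (still skewed) test set."""
    clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
    proba = clf.predict_proba(X_te)[:, 1]
    return f1_score(y_te, clf.predict(X_te)), roc_auc_score(y_te, proba)

# Balance the training set by oversampling defaulters with replacement
n_major = int((y_tr == 0).sum())
X_up = resample(X_tr[y_tr == 1], n_samples=n_major, random_state=0)
X_bal = np.vstack([X_tr[y_tr == 0], X_up])
y_bal = np.array([0] * n_major + [1] * n_major)

f1_raw, auc_raw = fit_eval(X_tr, y_tr)   # original skewed training set
f1_bal, auc_bal = fit_eval(X_bal, y_bal)  # balanced training set
```

Note that only the training set is resampled; the test set keeps the original class ratio so the two scores remain comparable.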


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Amir Ahmad ◽  
Ourooj Safi ◽  
Sharaf Malebary ◽  
Sami Alesawi ◽  
Entisar Alkayal

The coronavirus disease 2019 (Covid-19) pandemic has affected most countries of the world. The detection of Covid-19 positive cases is an important step in fighting the pandemic and saving human lives. The polymerase chain reaction test is the most widely used method to detect Covid-19 positive cases, and various molecular and serological methods have also been explored. Machine learning algorithms have been applied to various kinds of datasets to predict Covid-19 positive cases. In this paper, machine learning algorithms are applied to a Covid-19 dataset of commonly taken laboratory tests to predict positive cases; such datasets are easy to collect. The paper investigates the application of decision tree ensembles, which are accurate and robust to the selection of parameters. As there is an imbalance between the number of positive cases and the number of negative cases, decision tree ensembles developed for imbalanced datasets are applied. F-measure, precision, recall, area under the precision-recall curve, and area under the receiver operating characteristic curve are used to compare different decision tree ensembles. The different performance measures suggest that decision tree ensembles developed for imbalanced datasets perform better. Results also suggest that including age as a variable can improve the performance of the various ensembles of decision trees.
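One common way to adapt a decision tree ensemble to class imbalance, and to score it with the metrics listed above, can be sketched as follows. This is an assumed illustration on synthetic data, using class-weighted random forests as a stand-in for the specific imbalance-aware ensembles the paper investigates.

```python
# Sketch: an imbalance-aware tree ensemble (class-weighted random forest)
# evaluated with F-measure, PR AUC, and ROC AUC. Synthetic stand-in data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Heavily imbalanced data: ~10% positives, mimicking a low positive rate
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" up-weights the rare positive class during
# tree induction, one standard adaptation for imbalanced datasets
clf = RandomForestClassifier(class_weight="balanced", random_state=0)
clf.fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)[:, 1]
pr_auc = average_precision_score(y_te, proba)  # area under PR curve
roc_auc = roc_auc_score(y_te, proba)           # area under ROC curve
f_measure = f1_score(y_te, clf.predict(X_te))
```

On imbalanced data the precision-recall AUC is usually the more informative of the two areas, which is presumably why the paper reports both.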


2010 ◽  
Vol 7 (2) ◽  
pp. 2313-2360 ◽  
Author(s):  
F. Oehler ◽  
J. C. Rutherford ◽  
G. Coco

Abstract. We designed generalized simplified models using machine learning (ML) algorithms to assess denitrification at the catchment scale. In particular, we designed an artificial neural network (ANN) to simulate total nitrogen emissions from the denitrification process. Boosted regression trees (BRT, another ML technique) were also used to analyse the relationships and relative influences of different input variables on total denitrification. To calibrate the ANN and BRT models, we used a large database obtained by collating datasets from the literature. We developed a simple methodology to give confidence intervals for the calibration and validation process. Both ML algorithms clearly outperformed a commonly used simplified model of nitrogen emissions, NEMIS. NEMIS is based on denitrification potential, temperature, soil water content and nitrate concentration. The ML models used soil organic matter % in place of a denitrification potential and pH as a fifth input variable. The BRT analysis reaffirms the importance of temperature, soil water content and nitrate concentration. Generality of the ANN model may also be improved if pH is used to differentiate between soil types. Further improvements in model performance can be achieved by lessening dataset effects.
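The BRT relative-influence analysis described above can be sketched as follows. This is a hedged illustration with scikit-learn's gradient boosting on synthetic data, not the authors' collated literature database; the five input names simply mirror the variables the abstract lists.

```python
# Sketch: boosted regression trees reporting the relative influence of
# each input on the target, as in a BRT analysis. Synthetic stand-in data.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Five inputs analogous to the paper's: temperature, soil water content,
# nitrate concentration, soil organic matter %, and pH
X, y = make_regression(n_samples=300, n_features=5, random_state=0)
names = ["temperature", "soil_water", "nitrate", "organic_matter", "pH"]

brt = GradientBoostingRegressor(random_state=0).fit(X, y)

# feature_importances_ is normalized, so the values sum to 1 and can be
# read as relative influences on total denitrification
influence = dict(zip(names, brt.feature_importances_))
ranked = sorted(influence, key=influence.get, reverse=True)
```

On real data, partial dependence plots would typically accompany these influences to show the direction of each relationship.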


2020 ◽  
Vol 8 (1) ◽  
pp. e001055
Author(s):  
Xing-Wei Wu ◽  
Heng-Bo Yang ◽  
Rong Yuan ◽  
En-Wu Long ◽  
Rong-Sheng Tong

Objective: Medication adherence plays a key role in type 2 diabetes (T2D) care. Identifying patients at high risk of non-compliance helps individualized management, especially in China, where medical resources are relatively insufficient. However, models with good predictive capability have not been studied. This study aims to assess multiple machine learning algorithms and screen out a model that can be used to predict patients’ non-adherence risk.

Methods: A real-world registration study was conducted at Sichuan Provincial People’s Hospital from 1 April 2018 to 30 March 2019. Data on demographics, disease and treatment, diet and exercise, mental status, and treatment adherence of patients with T2D were obtained by face-to-face questionnaires. The medication possession ratio was used to evaluate patients’ medication adherence status. Machine learning algorithms were applied for modeling, including Bayesian network, neural net, support vector machine, and so on, and balanced sampling, data imputation, binning, and methods of feature selection were evaluated by the area under the receiver operating characteristic curve (AUC). We used two-way cross-validation to ensure the accuracy of model evaluation, and we performed a posteriori testing of the sample size based on the trend of the AUC as the sample size increases.

Results: A total of 401 patients out of 630 candidates were investigated, of whom 85 (21.20%) were evaluated as poorly adherent. Sixteen variables were selected as potential modeling variables, and 300 models were built based on 30 machine learning algorithms. Among these, the best-performing model achieved an AUC of 0.866±0.082. Imputation, oversampling, and a larger sample size helped improve predictive ability.

Conclusions: An accurate and sensitive adherence prediction model based on real-world registration data was established after evaluating data filling, balanced sampling, and so on, which may provide a technical tool for individualized diabetes care.
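The adherence label in this study is derived from the medication possession ratio (MPR), which can be computed as below. This is a minimal sketch: the 0.8 cutoff is a commonly used threshold assumed here for illustration, since the abstract does not state which cutoff the authors applied.

```python
# Sketch: labeling poor adherence from the medication possession ratio.
# The 0.8 threshold is an assumption, not taken from the study.

def medication_possession_ratio(days_supplied: int, days_in_period: int) -> float:
    """MPR = total days of medication supplied / days in the observation
    period (capped at 1.0 when supply exceeds the period)."""
    return min(days_supplied / days_in_period, 1.0)

def is_poor_adherence(mpr: float, threshold: float = 0.8) -> bool:
    """Flag a patient as poorly adherent when MPR falls below the cutoff."""
    return mpr < threshold

# Example: 240 days of supply over a 365-day observation window
mpr = medication_possession_ratio(240, 365)
poor = is_poor_adherence(mpr)
```

In the study this binary label (85 of 401 patients poorly adherent) is the outcome the 300 candidate models are trained to predict.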

