Machine learning models perform better than traditional empirical models for stomatal conductance when applied to multiple tree species across different forest biomes

Identification of Anti-cancer Peptides Based on Multi-classifier System

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207322666191203141102 ◽

2020 ◽

Vol 22 (10) ◽

pp. 694-704 ◽

Cited By ~ 2

Author(s):

Wanben Zhong ◽

Bineng Zhong ◽

Hongbo Zhang ◽

Ziyi Chen ◽

Yan Chen

Keyword(s):

Machine Learning ◽

Side Effect ◽

Learning Models ◽

Normal Cells ◽

Classifier System ◽

Prediction Rate ◽

Anti Cancer ◽

Feature Information ◽

Machine Learning Models ◽

Better Than

Aim and Objective: Cancer is one of the deadliest diseases, taking the lives of millions every year. Traditional methods of treating cancer are expensive and toxic to normal cells. Fortunately, anti-cancer peptides (ACPs) can eliminate this side effect. However, the identification and development of new anti Materials and Methods: In our study, a multi-classifier system was used, combined with multiple machine learning models, to predict anti-cancer peptides. The individual learners are built from different feature information and algorithms, and form a multi-classifier system by voting. Results and Conclusion: The experiments show that the overall prediction rate of each individual learner is above 80% and that the overall accuracy of the multi-classifier system for anti-cancer peptide prediction can reach 95.93%, which is better than the existing prediction model.
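The voting step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which voting rule was used, so plain plurality voting is an assumption here, and the function name `majority_vote` and the learner outputs are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-learner label lists into one label per sample by plurality vote."""
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every learner must predict the same number of samples")
    combined = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns the label with the highest count;
        # ties resolve to the label that appears first among the votes.
        label, _ = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined

# Hypothetical outputs of three individual learners (1 = ACP, 0 = non-ACP).
learner_a = [1, 0, 1, 1]
learner_b = [1, 1, 0, 1]
learner_c = [0, 0, 1, 1]
ensemble = majority_vote([learner_a, learner_b, learner_c])
print(ensemble)
```

With an odd number of binary learners no ties can occur, which is one reason such ensembles often use three or five members.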


Three machine learning models for the 2019 Solubility Challenge

ADMET & DMPK ◽

10.5599/admet.835 ◽

2020 ◽

Cited By ~ 1

Author(s):

John Mitchell

Keyword(s):

Machine Learning ◽

Random Forest ◽

Gold Standard ◽

Challenge Test ◽

Wisdom Of Crowds ◽

Learning Models ◽

The Third ◽

Aqueous Solubilities ◽

Machine Learning Models ◽

Better Than

We describe three machine learning models submitted to the 2019 Solubility Challenge. All are founded on tree-like classifiers, with one model being based on Random Forest and another on the related Extra Trees algorithm. The third model is a consensus predictor combining the former two with a Bagging classifier. We call this consensus classifier Vox Machinarum, and here discuss how it benefits from the Wisdom of Crowds. On the first 2019 Solubility Challenge test set of 100 low-variance intrinsic aqueous solubilities, Extra Trees is our best classifier. On the other, a high-variance set of 32 molecules, we find that Vox Machinarum and Random Forest both perform a little better than Extra Trees, and almost equally to one another. We also compare the gold standard solubilities from the 2019 Solubility Challenge with a set of literature-based solubilities for most of the same compounds.
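The consensus idea in this abstract can be sketched as follows. The abstract does not state how the three models are combined, so a simple per-sample mean is assumed; the function names and all numbers below are hypothetical and do not come from the paper.

```python
from statistics import mean

def consensus(*model_predictions):
    """Average the predictions of several models, sample by sample."""
    return [mean(values) for values in zip(*model_predictions)]

def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences."""
    return (sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)) ** 0.5

# Hypothetical log-solubility values, chosen only to illustrate the mechanics.
observed      = [-3.0, -4.5, -2.0, -5.0]
random_forest = [-3.4, -4.1, -2.3, -4.6]
extra_trees   = [-2.7, -4.9, -1.8, -5.5]
bagging       = [-3.2, -4.4, -2.4, -4.7]

vox = consensus(random_forest, extra_trees, bagging)
for name, preds in [("RF", random_forest), ("ET", extra_trees),
                    ("Bagging", bagging), ("Consensus", vox)]:
    print(name, round(rmse(preds, observed), 3))
```

In this toy data the errors of the individual models partly cancel, so the average does better than any single member. That is not guaranteed in general: the abstract itself reports that Extra Trees alone was best on one of the two test sets.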


Forecasting admissions in psychiatric hospitals before and during Covid-19

10.1101/2021.07.16.21260200 ◽

2021 ◽

Author(s):

Jan Wolff ◽

Ansgar Klimke ◽

Michael Marschollek ◽

Tim Kacprowski

Keyword(s):

Machine Learning ◽

Time Series ◽

Hospital Admissions ◽

Model Performance ◽

Psychiatric Hospitals ◽

Time Series Models ◽

Learning Models ◽

One Step ◽

Machine Learning Models ◽

Better Than

Introduction: The COVID-19 pandemic has had strong effects on most health care systems and individual service providers. Forecasting admissions can help with the efficient organisation of hospital care. We aimed to forecast the number of admissions to psychiatric hospitals before and during the COVID-19 pandemic, and we compared the performance of machine learning models and time series models. This would eventually support timely resource allocation for optimal treatment of patients. Methods: We used admission data from 9 psychiatric hospitals in Germany between 2017 and 2020. We compared machine learning models with time series models in weekly, monthly and yearly forecasting before and during the COVID-19 pandemic. Our models were trained and validated with data from the first two years and tested in prospectively sliding time windows in the last two years. Results: A total of 90,686 admissions were analysed. The models explained up to 90% of variance in hospital admissions in 2019 and 75% in 2020, under the effects of the COVID-19 pandemic. The best models substantially outperformed a one-step seasonal naive forecast (seasonal mean absolute scaled error (sMASE) 2019: 0.59, 2020: 0.76). The best model in 2019 was a machine learning model (elastic net, mean absolute error (MAE): 7.25). The best model in 2020 was a time series model (exponential smoothing state space model with Box-Cox transformation, ARMA errors and trend and seasonal components, MAE: 10.44), which adjusted more quickly to the shock effects of the COVID-19 pandemic. Models forecasting admissions one week in advance did not perform better than monthly and yearly models in 2019, but they did in 2020. The most important features for the machine learning models were calendrical variables. Conclusion: Model performance did not vary much between different modelling approaches before the COVID-19 pandemic, and established forecasts were substantially better than one-step seasonal naive forecasts. However, weekly time series models adjusted more quickly to the COVID-19-related shock effects. In practice, different forecast horizons could be used simultaneously to allow both early planning and quick adjustments to external effects.
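The comparison against a seasonal naive benchmark can be sketched as below. The abstract does not define sMASE precisely, so this sketch assumes the simplest reading: the model's MAE divided by the MAE of a seasonal naive forecast over the same test window. The function names and the data are hypothetical.

```python
def seasonal_naive(history, season_length, horizon):
    """Forecast each future step with the value observed one season earlier."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]

def mae(predicted, observed):
    """Mean absolute error."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)

def scaled_error(model_forecast, naive_forecast, observed):
    """Model MAE relative to the naive benchmark; values below 1 beat the benchmark."""
    return mae(model_forecast, observed) / mae(naive_forecast, observed)

# Hypothetical admission counts with a season of length 4.
history  = [10, 12, 14, 11, 11, 13, 15, 12]
observed = [12, 14, 13, 12]
model    = [12, 13, 14, 12]   # hypothetical model forecast

naive = seasonal_naive(history, season_length=4, horizon=4)
print(naive, scaled_error(model, naive, observed))
```

Read this way, the reported values of 0.59 and 0.76 would mean the best models had roughly 59% and 76% of the benchmark's absolute error, although the paper's exact scaling may differ.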


Machine Learning for Localizing Epileptogenic-Zone in the Temporal Lobe: Quantifying the Value of Multimodal Clinical-Semiology and Imaging Concordance

Frontiers in Digital Health ◽

10.3389/fdgth.2021.559103 ◽

2021 ◽

Vol 3 ◽

Author(s):

Ali Alim-Marvasti ◽

Fernando Pérez-García ◽

Karan Dahele ◽

Gloria Romagnoli ◽

Beate Diehl ◽

...

Keyword(s):

Machine Learning ◽

Temporal Lobe ◽

Support Vector ◽

Seizure Freedom ◽

Epileptogenic Zone ◽

Learning Models ◽

Seizure Semiology ◽

Multimodal Features ◽

Machine Learning Models ◽

Better Than

Background: Epilepsy affects 50 million people worldwide and a third are refractory to medication. If a discrete cerebral focus or network can be identified, neurosurgical resection can be curative. Most excisions are in the temporal-lobe, and are more likely to result in seizure-freedom than extra-temporal resections. However, less than half of patients undergoing surgery become entirely seizure-free. Localizing the epileptogenic-zone and individualized outcome predictions are difficult, requiring detailed evaluations at specialist centers. Methods: We used bespoke natural language processing to text-mine 3,800 electronic health records, from 309 epilepsy surgery patients, evaluated over a decade, of whom 126 remained entirely seizure-free. We investigated the diagnostic performances of machine learning models using set-of-semiology (SoS) with and without hippocampal sclerosis (HS) on MRI as features, using STARD criteria. Findings: Support Vector Classifiers (SVC) and Gradient Boosted (GB) decision trees were the best performing algorithms for temporal-lobe epileptogenic zone localization (cross-validated Matthews correlation coefficient (MCC) SVC 0.73 ± 0.25, balanced accuracy 0.81 ± 0.14, AUC 0.95 ± 0.05). Models that only used seizure semiology were not always better than internal benchmarks. The combination of multimodal features, however, enhanced performance metrics including MCC and normalized mutual information (NMI) compared to either alone (p < 0.0001). This combination of semiology and HS on MRI increased both cross-validated MCC and NMI by over 25% (NMI, SVC SoS: 0.35 ± 0.28 vs. SVC SoS+HS: 0.61 ± 0.27). Interpretation: Machine learning models using only the set of seizure semiology (SoS) cannot unequivocally perform better than benchmarks in temporal epileptogenic-zone localization. However, the combination of SoS with an imaging feature (HS) enhances epileptogenic lobe localization. We quantified this added NMI value to be 25% in absolute terms. Despite good performance in localization, no model was able to predict seizure-freedom better than benchmarks. The methods used are widely applicable, and the performance enhancements by combining other clinical, imaging and neurophysiological features could be similarly quantified. Multicenter studies are required to confirm generalizability. Funding: Wellcome/EPSRC Center for Interventional and Surgical Sciences (WEISS) (203145Z/16/Z).
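Two of the metrics reported in this abstract, the Matthews correlation coefficient and balanced accuracy, can be computed from a binary confusion matrix as sketched below. The formulas are the standard ones; the labels and the function names are hypothetical, and the sketch assumes both classes are present in the true labels.

```python
from math import sqrt

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when a margin is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity (requires both classes to occur)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# Hypothetical labels: 1 = temporal-lobe epileptogenic zone, 0 = elsewhere.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(counts, mcc(*counts), balanced_accuracy(*counts))
```

Both metrics are less flattering than plain accuracy when classes are imbalanced, which is a common reason for reporting them alongside AUC.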


MO353 MACHINE LEARNING-BASED PREDICTION OF ACUTE KIDNEY INJURY AFTER NEPHRECTOMY IN PATIENTS WITH RENAL CELL CARCINOMA

Nephrology Dialysis Transplantation ◽

10.1093/ndt/gfab082.007 ◽

2021 ◽

Vol 36 (Supplement_1) ◽

Author(s):

Sejoong Kim ◽

Yeonhee Lee ◽

Seung Seok Han

Keyword(s):

Machine Learning ◽

Renal Cell Carcinoma ◽

Kidney Injury ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Learning Models ◽

Scoring Model ◽

Postoperative AKI ◽

Machine Learning Models ◽

Better Than

Abstract Background and Aims: The precise prediction of acute kidney injury (AKI) after nephrectomy for renal cell carcinoma (RCC) is an important issue because of its relationship with subsequent kidney dysfunction and high mortality. Herein we addressed whether machine learning algorithms could predict postoperative AKI risk better than conventional logistic regression (LR) models. Method: A total of 4,104 RCC patients who had undergone unilateral nephrectomy from January 2003 to December 2017 were reviewed. Machine learning models such as support vector machine, random forest, extreme gradient boosting, and light gradient boosting machine (LightGBM) were developed, and their performance based on the area under the receiver operating characteristic curve, accuracy, and F1 score was compared with that of the LR-based scoring model. Results: Postoperative AKI developed in 1,167 patients (28.4%). All the machine learning models had higher performance index values than the LR-based scoring model. Among them, the LightGBM model had the highest value of 0.810 (0.783–0.837). The decision curve analysis demonstrated a greater net benefit of the machine learning models than the LR-based scoring model over all the ranges of threshold probabilities. The LightGBM and random forest models, but not others, were well calibrated. Conclusion: The application of machine learning algorithms improves the predictability of AKI after nephrectomy for RCC, and these models perform better than conventional LR-based models.
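The net benefit used in decision curve analysis can be sketched with its standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The function name and the toy data are hypothetical and unrelated to the study's cohort.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating everyone whose predicted risk reaches the threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * (threshold / (1 - threshold))

# Hypothetical outcomes (1 = postoperative AKI) and predicted risks.
y_true = [1, 1, 0, 0, 0]
risk   = [0.9, 0.6, 0.7, 0.2, 0.1]

model_nb     = net_benefit(y_true, risk, threshold=0.5)
treat_all_nb = net_benefit(y_true, [1.0] * len(y_true), threshold=0.5)
print(model_nb, treat_all_nb)
```

A decision curve repeats this calculation over a range of thresholds and plots the model against the "treat all" and "treat none" (net benefit 0) strategies.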


Global solar radiation modeling using different machine learning and empirical models in Northeast China

10.21203/rs.3.rs-422151/v1 ◽

2021 ◽

Author(s):

Yue Jia ◽

Yongjun Su ◽

Fengchun Wang ◽

Pengcheng Li ◽

Shuyi Huo

Keyword(s):

Machine Learning ◽

Solar Radiation ◽

Solar Energy ◽

Northeast China ◽

Sunshine Duration ◽

Meteorological Data ◽

Global Solar Radiation ◽

Empirical Models ◽

Learning Models ◽

Machine Learning Models

Abstract Reliable global solar radiation (Rs) information is crucial for the design and management of solar energy systems for agricultural and industrial production. However, Rs measurements are unavailable in many regions of the world, which impedes the development and application of solar energy. To accurately estimate Rs, this study developed a novel machine learning model, called a Gaussian exponential model (GEM), for daily global Rs estimation. The GEM was compared with four other machine learning models and two empirical models to assess its applicability using daily meteorological data from 1997–2016 from four stations in Northeast China. The results showed that the GEM with complete inputs had the best performance. Machine learning models provided better estimates than empirical models when trained by the same input data. Sunshine duration was the most effective factor determining the accuracy of the machine learning models. Overall, the GEM with complete inputs had the highest accuracy and is recommended for modeling daily Rs in Northeast China.


Comparisons of forecasting for Survival Outcome for Head and Neck Squamous Cell Carcinoma by using Six Machine Learning Models Based on Multi-Omics

10.21203/rs.3.rs-1100398/v1 ◽

2021 ◽

Author(s):

Liying Mo ◽

Yuangang Su ◽

Jianhui Yuan ◽

Zhiwei Xiao ◽

Ziyan Zhang ◽

...

Keyword(s):

Machine Learning ◽

Squamous Cell Carcinoma ◽

Cell Carcinoma ◽

Head And Neck ◽

Squamous Cell ◽

Survival Outcome ◽

Learning Models ◽

Machine Learning Methods ◽

Machine Learning Models ◽

Better Than

Abstract Background: Machine learning methods have shown excellent predictive ability in a wide range of fields. For the survival of head and neck squamous cell carcinoma (HNSC), its multi-omics influence is crucial. This study attempts to establish a variety of machine learning multi-omics models to predict the survival of HNSC and to find the most suitable machine learning prediction method. Results: For omics of HNSC, the results of all six models showed that the performance of multi-omics was better than that of each single-omic alone. The BN model showed good prediction performance (area under the curve [AUC] 0.8250) on HNSC multi-omics data. The other machine learning models, RF (AUC = 0.8002), NN (AUC = 0.7200), and GLM (AUC = 0.7145), also showed high predictive performance, except for DT (AUC = 0.5149) and SVM (AUC = 0.6981). The results of in vitro qPCR were consistent with the random forest algorithm. Conclusion: Machine learning methods could better forecast the survival outcome of HNSC. This study found that the Bayesian network performed best. Moreover, the forecast result of multi-omics was better than that of single-omic alone in HNSC.
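The AUC values compared in this abstract can be understood through the pairwise definition of the area under the ROC curve: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses that definition directly; the function name and the data are hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical outcome labels and model scores.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.7, 0.4]
auc = roc_auc(y_true, scores)
print(round(auc, 4))
```

On this reading, an AUC near 0.5, such as the value reported for DT, indicates ranking that is little better than chance.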


Week 3–4 Prediction of Wintertime CONUS Temperature Using Machine Learning Techniques

Frontiers in Climate ◽

10.3389/fclim.2021.697423 ◽

2021 ◽

Vol 3 ◽

Author(s):

Paul Buchmann ◽

Timothy DelSole

Keyword(s):

Machine Learning ◽

Regression Models ◽

Large Scale ◽

Climate Model ◽

Dynamical Model ◽

Learning Models ◽

Machine Learning Model ◽

Climate Model Output ◽

Machine Learning Models ◽

Better Than

This paper shows that skillful week 3–4 predictions of a large-scale pattern of 2 m temperature over the US can be made based on the Nino3.4 index alone, where skillful is defined to be better than climatology. To find more skillful regression models, this paper explores various machine learning strategies (e.g., ridge regression and lasso), including those trained on observations and on climate model output. It is found that regression models trained on climate model output yield more skillful predictions than regression models trained on observations, presumably because of the larger training sample. Nevertheless, the skill of the best machine learning models is only modestly better than that of ordinary least squares based on the Nino3.4 index. Importantly, this fact is difficult to infer from the parameters of the machine learning model because very different parameter sets can produce virtually identical predictions. For this reason, attempts to interpret the source of predictability from the machine learning model can be very misleading. The skill of the machine learning models is also compared to that of a fully coupled dynamical model, CFSv2. The results depend on the skill measure: for mean square error, the dynamical model is slightly worse than the machine learning models; for correlation skill, the dynamical model is only modestly better than machine learning models or the Nino3.4 index. In summary, the best predictions of the large-scale pattern come from machine learning models trained on long climate simulations, but the skill is only modestly better than predictions based on the Nino3.4 index alone.
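The baseline described in this abstract, ordinary least squares on a single index with skill judged against climatology, can be sketched as follows. The mean-square-error skill score used here (1 minus the ratio of model MSE to climatology MSE) is an assumed formalisation of "better than climatology", and all names and numbers are hypothetical.

```python
def ols_fit(x, y):
    """Least-squares slope and intercept for a single predictor."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return slope, my - slope * mx

def mse(predicted, observed):
    """Mean square error."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def skill_vs_climatology(predicted, observed, climatology):
    """Positive when the prediction has lower MSE than a constant climatological forecast."""
    return 1 - mse(predicted, observed) / mse([climatology] * len(observed), observed)

# Hypothetical training data: index values and temperature-pattern amplitudes.
index_train = [-1.0, -0.5, 0.0, 0.5, 1.0]
temp_train  = [-0.8, -0.6, 0.1, 0.3, 1.0]
slope, intercept = ols_fit(index_train, temp_train)

# Hypothetical verification data.
index_test = [-0.8, 0.2, 0.9]
temp_test  = [-0.9, 0.3, 0.7]
forecast = [slope * x + intercept for x in index_test]
climatology = sum(temp_train) / len(temp_train)
skill = skill_vs_climatology(forecast, temp_test, climatology)
print(round(slope, 3), round(skill, 3))
```

Ridge regression or lasso would replace `ols_fit` with a penalised fit over many predictors; the skill comparison itself would stay the same.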


Machine Learning Models Prognosticate Functional Outcomes Better than Clinical Scores in Spontaneous Intracerebral Haemorrhage

Journal of Stroke and Cerebrovascular Diseases ◽

10.1016/j.jstrokecerebrovasdis.2021.106234 ◽

2022 ◽

Vol 31 (2) ◽

pp. 106234

Author(s):

Mervyn Jun Rui Lim ◽

Raphael Hao Chong Quek ◽

Kai Jie Ng ◽

Ne-Hooi Will Loh ◽

Sein Lwin ◽

...

Keyword(s):

Machine Learning ◽

Intracerebral Haemorrhage ◽

Functional Outcomes ◽

Learning Models ◽

Clinical Scores ◽

Spontaneous Intracerebral Haemorrhage ◽

Machine Learning Models ◽

Better Than


Development of an Optimal Model For Rate of Penetration Rop Using Deep Neural Networks DNN.

10.2118/207161-ms ◽

2021 ◽

Author(s):

_ _

Keyword(s):

Machine Learning ◽

Empirical Model ◽

Empirical Models ◽

The Other ◽

Gradient Boosting ◽

Past Century ◽

Learning Models ◽

Continuous Increase ◽

Unseen Data ◽

Machine Learning Models

Abstract For the past century, the optimization of drilling has caught the eye of many researchers. The main areas center on ROP, fluid treatment, and bit selection. They all share the same goal of maximizing ROP and reducing NPT. In order to develop an optimal control system, ROP must be predicted accurately. Unfortunately, it is a complex parameter that is affected by multiple drilling parameters, rock properties, fluid properties, and bit selection. Models used for prediction have developed from empirical models like Bourgoyne and Young's to more intelligent models such as SVM and ANN. With the continuous increase in data obtained from sensors while drilling, there is still much work to be done in this field. In this research, the improvement of an empirical model and the development of an intelligent model are presented. The Bourgoyne and Young model uses multiple linear regression to estimate coefficients, which it then inserts into an empirical formula to predict ROP. This model was modified using non-linear curve-fitting to estimate the coefficients, with the aim of reducing bias and generalizing better. Machine learning models such as Gradient Boosting, Random Forest, ANN, and DNN were used in the development of a predictive model for the ROP. These models were easier to develop than the empirical model since they rely on data rather than on statistical formulas. The data used in this research include drilling data from 3 wells drilled in 2 fields within the Niger Delta region in Nigeria. The models were developed and trained on one of the wells, while the remaining two were used for testing the performance of the models. The modified empirical model improved the efficiency of the base model by 14% during validation but performed poorly on unseen data from the other two wells. The machine learning models outperformed the empirical models and performed accurately on unseen data from the other wells. DNN was the best performing model, achieving an average accuracy of 0.987 for the 3 wells.
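The evaluation protocol in this abstract, training on one well and testing on the others, amounts to a group-wise holdout. A minimal sketch is given below; the record layout, the field names (`well`, `wob`, `rop`) and the values are hypothetical and are not taken from the paper.

```python
def split_by_well(records, training_well):
    """Return the training well's records and a per-well dict of held-out records."""
    train = [r for r in records if r["well"] == training_well]
    held_out = {}
    for r in records:
        if r["well"] != training_well:
            held_out.setdefault(r["well"], []).append(r)
    if not train:
        raise ValueError("no records found for the training well")
    return train, held_out

# Hypothetical drilling records: weight on bit (wob) and rate of penetration (rop).
records = [
    {"well": "A", "wob": 12.0, "rop": 30.5},
    {"well": "A", "wob": 15.0, "rop": 34.1},
    {"well": "B", "wob": 11.0, "rop": 28.9},
    {"well": "C", "wob": 14.0, "rop": 33.0},
    {"well": "C", "wob": 16.0, "rop": 35.2},
]

train, held_out = split_by_well(records, training_well="A")
print(len(train), {well: len(rows) for well, rows in held_out.items()})
```

Splitting by well rather than by random row keeps correlated measurements from the same borehole out of the test set, which is what makes the "unseen data" claim meaningful.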
