Machine learning models perform better than traditional empirical models for stomatal conductance when applied to multiple tree species across different forest biomes

2021 ◽  
Vol 6 ◽  
pp. 100139
Author(s):  
Alta Saunders ◽  
David M. Drew ◽  
Willie Brink
2020 ◽  
Vol 22 (10) ◽  
pp. 694-704 ◽  
Author(s):  
Wanben Zhong ◽  
Bineng Zhong ◽  
Hongbo Zhang ◽  
Ziyi Chen ◽  
Yan Chen

Aim and Objective: Cancer is one of the deadliest diseases, taking the lives of millions every year. Traditional methods of treating cancer are expensive and toxic to normal cells. Fortunately, anti-cancer peptides (ACPs) can eliminate this side effect. However, the identification and development of new anti Materials and Methods: In our study, a multi-classifier system combining multiple machine learning models was used to predict anti-cancer peptides. The individual learners are built from different feature information and algorithms, and form a multi-classifier system by voting. Results and Conclusion: The experiments show that the overall prediction rate of each individual learner is above 80% and that the overall accuracy of the multi-classifier system for anti-cancer peptide prediction can reach 95.93%, which is better than the existing prediction model.
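The voting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which voting rule the authors used, so plain majority voting is assumed here, and the three learners and their outputs are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most individual learners.

    `predictions` holds one label per learner. Ties go to the label
    that appears first in the list (an arbitrary, assumed rule).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def vote_all(per_learner_predictions):
    """Combine per-learner prediction lists into one voted label per sample."""
    return [majority_vote(list(votes)) for votes in zip(*per_learner_predictions)]


# Hypothetical outputs of three learners on four peptides (1 = ACP, 0 = non-ACP).
learner_a = [1, 0, 1, 1]
learner_b = [1, 1, 0, 1]
learner_c = [0, 0, 1, 1]
ensemble = vote_all([learner_a, learner_b, learner_c])
```

Each learner is wrong on a different peptide in this toy input, so the vote can correct individual errors, which is the usual motivation for combining learners built on different features.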


ADMET & DMPK ◽  
2020 ◽  
Author(s):  
John Mitchell

We describe three machine learning models submitted to the 2019 Solubility Challenge. All are founded on tree-like classifiers, with one model being based on Random Forest and another on the related Extra Trees algorithm. The third model is a consensus predictor combining the former two with a Bagging classifier. We call this consensus classifier Vox Machinarum, and here discuss how it benefits from the Wisdom of Crowds. On the first 2019 Solubility Challenge test set of 100 low-variance intrinsic aqueous solubilities, Extra Trees is our best classifier. On the other, a high-variance set of 32 molecules, we find that Vox Machinarum and Random Forest both perform a little better than Extra Trees, and almost equally to one another. We also compare the gold standard solubilities from the 2019 Solubility Challenge with a set of literature-based solubilities for most of the same compounds.


2021 ◽  
Author(s):  
Jan Wolff ◽  
Ansgar Klimke ◽  
Michael Marschollek ◽  
Tim Kacprowski

Introduction The COVID-19 pandemic has strong effects on most health care systems and individual service providers. Forecasting admissions can help organise hospital care efficiently. We aimed to forecast the number of admissions to psychiatric hospitals before and during the COVID-19 pandemic, and we compared the performance of machine learning models and time series models. This could eventually support timely resource allocation for optimal treatment of patients. Methods We used admission data from 9 psychiatric hospitals in Germany between 2017 and 2020. We compared machine learning models with time series models in weekly, monthly and yearly forecasting before and during the COVID-19 pandemic. Our models were trained and validated with data from the first two years and tested in prospectively sliding time-windows in the last two years. Results A total of 90,686 admissions were analysed. The models explained up to 90% of variance in hospital admissions in 2019 and 75% in 2020 with the effects of the COVID-19 pandemic. The best models substantially outperformed a one-step seasonal naive forecast (seasonal mean absolute scaled error (sMASE) 2019: 0.59, 2020: 0.76). The best model in 2019 was a machine learning model (elastic net, mean absolute error (MAE): 7.25). The best model in 2020 was a time series model (exponential smoothing state space model with Box-Cox transformation, ARMA errors and trend and seasonal components, MAE: 10.44), which adjusted more quickly to the shock effects of the COVID-19 pandemic. Models forecasting admissions one week in advance did not perform better than monthly and yearly models in 2019, but they did in 2020. The most important features for the machine learning models were calendrical variables. Conclusion Model performance did not vary much between different modelling approaches before the COVID-19 pandemic, and established forecasts were substantially better than one-step seasonal naive forecasts. However, weekly time series models adjusted more quickly to the COVID-19 related shock effects. In practice, different forecast horizons could be used simultaneously to allow both early planning and quick adjustments to external effects.
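The abstract reports MAE and a seasonal MASE without defining them. The sketch below assumes the common definition, in which the forecast MAE is divided by the in-sample MAE of a seasonal naive forecast. The function names and the toy series are hypothetical and are not taken from the study.

```python
def mae(actual, forecast):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def seasonal_naive_mae(history, period):
    """In-sample MAE of the seasonal naive forecast y[t] = y[t - period]."""
    errors = [abs(history[t] - history[t - period])
              for t in range(period, len(history))]
    return sum(errors) / len(errors)


def seasonal_mase(actual, forecast, history, period):
    """Forecast MAE scaled by the seasonal naive MAE on the training history.

    Values below 1 mean the forecast beats the seasonal naive baseline.
    """
    return mae(actual, forecast) / seasonal_naive_mae(history, period)


# Toy admissions series with a period of 2 (assumed data, illustration only).
history = [10, 20, 12, 22, 14, 24]
actual = [16, 26]
forecast = [15, 27]
score = seasonal_mase(actual, forecast, history, period=2)
```

Under this reading, the reported values of 0.59 and 0.76 are both below 1, which matches the statement that the best models outperformed the seasonal naive forecast.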


2021 ◽  
Vol 3 ◽  
Author(s):  
Ali Alim-Marvasti ◽  
Fernando Pérez-García ◽  
Karan Dahele ◽  
Gloria Romagnoli ◽  
Beate Diehl ◽  
...  

Background: Epilepsy affects 50 million people worldwide and a third are refractory to medication. If a discrete cerebral focus or network can be identified, neurosurgical resection can be curative. Most excisions are in the temporal lobe, and are more likely to result in seizure-freedom than extra-temporal resections. However, less than half of patients undergoing surgery become entirely seizure-free. Localizing the epileptogenic zone and individualized outcome predictions are difficult, requiring detailed evaluations at specialist centers. Methods: We used bespoke natural language processing to text-mine 3,800 electronic health records, from 309 epilepsy surgery patients, evaluated over a decade, of whom 126 remained entirely seizure-free. We investigated the diagnostic performances of machine learning models using set-of-semiology (SoS) with and without hippocampal sclerosis (HS) on MRI as features, using STARD criteria. Findings: Support Vector Classifiers (SVC) and Gradient Boosted (GB) decision trees were the best performing algorithms for temporal-lobe epileptogenic zone localization (cross-validated Matthews correlation coefficient (MCC) SVC 0.73 ± 0.25, balanced accuracy 0.81 ± 0.14, AUC 0.95 ± 0.05). Models that only used seizure semiology were not always better than internal benchmarks. The combination of multimodal features, however, enhanced performance metrics including MCC and normalized mutual information (NMI) compared to either alone (p < 0.0001). This combination of semiology and HS on MRI increased both cross-validated MCC and NMI by over 25% (NMI, SVC SoS: 0.35 ± 0.28 vs. SVC SoS+HS: 0.61 ± 0.27). Interpretation: Machine learning models using only the set of seizure semiology (SoS) cannot unequivocally perform better than benchmarks in temporal epileptogenic-zone localization. However, the combination of SoS with an imaging feature (HS) enhances epileptogenic lobe localization. We quantified this added NMI value to be 25% in absolute terms. Despite good performance in localization, no model was able to predict seizure-freedom better than benchmarks. The methods used are widely applicable, and the performance enhancements by combining other clinical, imaging and neurophysiological features could be similarly quantified. Multicenter studies are required to confirm generalizability. Funding: Wellcome/EPSRC Center for Interventional and Surgical Sciences (WEISS) (203145Z/16/Z).
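Two of the metrics named in this abstract, the Matthews correlation coefficient and balanced accuracy, can be computed from binary confusion-matrix counts. This is a minimal sketch of the standard formulas. The counts are invented for illustration and do not come from the study.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def balanced_accuracy(tp, fp, fn, tn):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Invented counts: 40 true positives, 10 false positives,
# 10 false negatives, 40 true negatives.
mcc = matthews_corrcoef(tp=40, fp=10, fn=10, tn=40)
bal_acc = balanced_accuracy(tp=40, fp=10, fn=10, tn=40)
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it more informative than plain accuracy when classes are imbalanced.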


2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
Sejoong Kim ◽  
Yeonhee Lee ◽  
Seung Seok Han

Abstract Background and Aims The precise prediction of acute kidney injury (AKI) after nephrectomy for renal cell carcinoma (RCC) is an important issue because of its relationship with subsequent kidney dysfunction and high mortality. Herein we addressed whether machine learning algorithms could predict postoperative AKI risk better than conventional logistic regression (LR) models. Method A total of 4,104 RCC patients who had undergone unilateral nephrectomy from January 2003 to December 2017 were reviewed. Machine learning models such as support vector machine, random forest, extreme gradient boosting, and light gradient boosting machine (LightGBM) were developed, and their performance based on the area under the receiver operating characteristic curve, accuracy, and F1 score was compared with that of the LR-based scoring model. Results Postoperative AKI developed in 1,167 patients (28.4%). All the machine learning models had higher performance index values than the LR-based scoring model. Among them, the LightGBM model had the highest value of 0.810 (0.783–0.837). The decision curve analysis demonstrated a greater net benefit of the machine learning models than the LR-based scoring model over all the ranges of threshold probabilities. The LightGBM and random forest models, but not others, were well calibrated. Conclusion The application of machine learning algorithms improves the predictability of AKI after nephrectomy for RCC, and these models perform better than conventional LR-based models.


2021 ◽  
Author(s):  
Yue Jia ◽  
Yongjun Su ◽  
Fengchun Wang ◽  
Pengcheng Li ◽  
Shuyi Huo

Abstract Reliable global solar radiation (Rs) information is crucial for the design and management of solar energy systems for agricultural and industrial production. However, Rs measurements are unavailable in many regions of the world, which impedes the development and application of solar energy. To accurately estimate Rs, this study developed a novel machine learning model, called a Gaussian exponential model (GEM), for daily global Rs estimation. The GEM was compared with four other machine learning models and two empirical models to assess its applicability using daily meteorological data from 1997–2016 from four stations in Northeast China. The results showed that the GEM with complete inputs had the best performance. Machine learning models provided better estimates than empirical models when trained by the same input data. Sunshine duration was the most effective factor determining the accuracy of the machine learning models. Overall, the GEM with complete inputs had the highest accuracy and is recommended for modeling daily Rs in Northeast China.


2021 ◽  
Author(s):  
Liying Mo ◽  
Yuangang Su ◽  
Jianhui Yuan ◽  
Zhiwei Xiao ◽  
Ziyan Zhang ◽  
...  

Abstract Background: Machine learning methods have shown excellent predictive ability in a wide range of fields. Multi-omics factors have a crucial influence on survival in head and neck squamous cell carcinoma (HNSC). This study attempts to establish a variety of machine learning multi-omics models to predict the survival of HNSC and to find the most suitable machine learning prediction method. Results: For omics of HNSC, the results of the six models all showed that the performance of multi-omics was better than that of each single-omic alone. The BN model showed good predictive performance (area under the curve [AUC] 0.8250) on the HNSC multi-omics data. Among the other machine learning models, RF (AUC = 0.8002), NN (AUC = 0.7200), and GLM (AUC = 0.7145) also showed high predictive performance, whereas DT (AUC = 0.5149) and SVM (AUC = 0.6981) did not. The results of in vitro qPCR were consistent with the random forest algorithm. Conclusion: Machine learning methods could better forecast the survival outcome of HNSC. This study found that the Bayesian network performed best. Moreover, the multi-omics forecast was better than that of any single-omic alone in HNSC.


2021 ◽  
Vol 3 ◽  
Author(s):  
Paul Buchmann ◽  
Timothy DelSole

This paper shows that skillful week 3–4 predictions of a large-scale pattern of 2 m temperature over the US can be made based on the Nino3.4 index alone, where skillful is defined to be better than climatology. To find more skillful regression models, this paper explores various machine learning strategies (e.g., ridge regression and lasso), including those trained on observations and on climate model output. It is found that regression models trained on climate model output yield more skillful predictions than regression models trained on observations, presumably because of the larger training sample. Nevertheless, the skill of the best machine learning models is only modestly better than ordinary least squares based on the Nino3.4 index. Importantly, this fact is difficult to infer from the parameters of the machine learning model because very different parameter sets can produce virtually identical predictions. For this reason, attempts to interpret the source of predictability from the machine learning model can be very misleading. The skill of machine learning models is also compared to that of a fully coupled dynamical model, CFSv2. The results depend on the skill measure: for mean square error, the dynamical model is slightly worse than the machine learning models; for correlation skill, the dynamical model is only modestly better than machine learning models or the Nino3.4 index. In summary, the best predictions of the large-scale pattern come from machine learning models trained on long climate simulations, but the skill is only modestly better than predictions based on the Nino3.4 index alone.
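The relationship between ordinary least squares on a single index and ridge regression can be sketched for one predictor, where both have a closed form. This is an illustrative toy under stated assumptions, not the paper's setup. The data, the penalty value, and the skill score (one minus the ratio of model MSE to climatology MSE) are all assumed here.

```python
def fit_ridge_1d(x, y, lam=0.0):
    """Fit y = slope * x + intercept with an L2 penalty `lam` on the slope.

    With lam = 0 this reduces to ordinary least squares.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / (sxx + lam)
    intercept = y_mean - slope * x_mean
    return slope, intercept


def mse_skill_vs_climatology(actual, predicted):
    """1 - MSE(model) / MSE(climatology), with climatology = mean of `actual`.

    Positive values mean the model beats the climatological mean.
    """
    clim = sum(actual) / len(actual)
    mse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    mse_clim = sum((a - clim) ** 2 for a in actual) / len(actual)
    return 1 - mse_model / mse_clim


# Toy index values and temperature anomalies (assumed, exactly linear).
x = [-1.0, 0.0, 1.0, 2.0]
y = [-2.0, 0.0, 2.0, 4.0]
ols = fit_ridge_1d(x, y)             # no penalty
ridge = fit_ridge_1d(x, y, lam=5.0)  # penalty shrinks the slope
ols_pred = [ols[0] * xi + ols[1] for xi in x]
skill = mse_skill_vs_climatology(y, ols_pred)  # in-sample, for illustration
```

The penalty shrinks the slope towards zero, which trades some bias for lower variance when the training sample is small.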


2021 ◽  
Author(s):  
_ _

Abstract For the past century, optimization of drilling has caught the eyes of many researchers. The main areas center on ROP, fluid treatment, and bit selection. They all share the same goal of maximizing ROP and reducing NPT. In order to develop an optimal control system, ROP must be predicted accurately. Unfortunately, it is a complex parameter that is affected by multiple drilling parameters, rock properties, fluid properties, and bit selection. Models used for prediction have developed from empirical models like Bourgoyne and Young's to more intelligent models such as SVM and ANN. With the continuous increase in data obtained from sensors while drilling, there is still much work to be done in this field. In this research, the improvement of an empirical model and the development of an intelligent model are presented. The Bourgoyne and Young's model uses multiple linear regression to estimate coefficients, which it then inserts into an empirical formula to predict ROP. This model was modified using non-linear curve-fitting to estimate the coefficients, with the aim of reducing bias and generalizing better. Machine learning models such as Gradient Boosting, Random Forest, ANN, and DNN were used in the development of a predictive model for the ROP. These models were easier to develop than the empirical model since they rely more on data than on statistical formulas. The data used in this research include drilling data from 3 wells drilled in 2 fields within the Niger Delta region in Nigeria. The models were developed and trained on one of the wells, while the remaining two were used for testing the performance of the models. The modified empirical model improved the efficiency of the base model by 14% during validation but performed poorly on unseen data from the other two wells. The machine learning models outperformed the empirical models and performed accurately on unseen data from the other wells. DNN was the best performing model, achieving an average accuracy of 0.987 for the 3 wells.

