Machine Learning based Regression Model for Prediction of Soil Surface Humidity over Moderately Vegetated Fields

Author(s):  
Emrullah ACAR ◽  
Mehmet Sirac OZERDEM ◽  
Burak Berk USTUNDAG
2020 ◽  
Vol 237 (12) ◽  
pp. 1430-1437
Author(s):  
Achim Langenbucher ◽  
Nóra Szentmáry ◽  
Jascha Wendelstein ◽  
Peter Hoffmann

Abstract Background and Purpose In the last decade, artificial intelligence and machine learning algorithms have become increasingly established for the screening and detection of diseases and pathologies, as well as for describing interactions between measures where classical methods are too complex or fail. The purpose of this paper is to model the measured postoperative position of an intraocular lens implant after cataract surgery, based on preoperatively assessed biometric effect sizes, using machine learning techniques. Patients and Methods In this study, we enrolled 249 eyes of patients who underwent elective cataract surgery at Augenklinik Castrop-Rauxel. Eyes were measured preoperatively with the IOLMaster 700 (Carl Zeiss Meditec), as well as preoperatively and postoperatively with the Casia 2 OCT (Tomey). Based on the preoperative effect sizes axial length, corneal thickness, internal anterior chamber depth, thickness of the crystalline lens, mean corneal radius and corneal diameter, a selection of 17 machine learning algorithms was tested for performance in predicting the internal anterior chamber depth (AQD_post) and the axial position of the equatorial plane of the lens in the pseudophakic eye (LEQ_post). Results The 17 machine learning algorithms (from 4 families) varied in root mean squared/mean absolute prediction error between 0.187/0.139 mm and 0.255/0.204 mm (AQD_post) and between 0.183/0.135 mm and 0.253/0.206 mm (LEQ_post), using 5-fold cross-validation. The Gaussian Process Regression Model using an exponential kernel showed the best performance in terms of root mean squared error for prediction of AQD_post and LEQ_post. If the entire dataset is used (without splitting into training and validation data), a simple multivariate linear regression model showed a root mean squared prediction error for AQD_post/LEQ_post of 0.188/0.187 mm, vs. 0.166/0.159 mm for the best-performing Gaussian Process Regression Model. Conclusion In this paper we wanted to show the principles of supervised machine learning applied to prediction of the measured physical postoperative axial position of the intraocular lenses. Based on our limited data pool and the algorithms used in our setting, the benefit of machine learning algorithms seems to be limited compared to a standard multivariate regression model.
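
The 5-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single predictor and a closed-form least-squares line rather than the 17 algorithms of the study; the function names and the toy data are invented for illustration and are not taken from the paper.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (one predictor, closed form)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validated_errors(xs, ys, k=5):
    """Return (RMSE, MAE) pooled over the held-out predictions of all k folds."""
    residuals = []
    for test_idx in kfold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        a, b = fit_line(train_x, train_y)  # fit on the other k-1 folds only
        residuals.extend(ys[i] - (a + b * xs[i]) for i in test_idx)
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    mae = sum(abs(r) for r in residuals) / len(residuals)
    return rmse, mae

# Toy data, invented for illustration (not measurements from the study)
xs = [22.1, 22.8, 23.0, 23.4, 23.9, 24.2, 24.6, 25.1, 25.7, 26.3]
ys = [4.31, 4.42, 4.40, 4.55, 4.58, 4.70, 4.66, 4.81, 4.88, 5.01]
rmse, mae = cross_validated_errors(xs, ys, k=5)
print(f"5-fold CV: RMSE = {rmse:.3f}, MAE = {mae:.3f}")
```

In practice the rows would usually be shuffled before being split into folds; contiguous folds are used here only to keep the sketch deterministic.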


2020 ◽  
Vol 22 (Supplement_2) ◽  
pp. ii135-ii136
Author(s):  
John Lin ◽  
Michelle Mai ◽  
Saba Paracha

Abstract Glioblastoma multiforme (GBM), the most common form of glioma, is a malignant tumor with a high risk of mortality. By providing accurate survival estimates, prognostic models have been identified as promising tools in clinical decision support. In this study, we produced and validated two machine learning-based models to predict survival time for GBM patients. Publicly available clinical and genomic data from The Cancer Genome Atlas (TCGA) and Broad Institute GDAC Firehose were obtained through cBioPortal. Random forest and multivariate regression models were created to predict survival. Predictive accuracy was assessed and compared through mean absolute error (MAE) and root mean square error (RMSE) calculations. A total of 619 GBM patients were included in the dataset. There were 381 (62.9%) cases of recurrence/progression and 53 (8.7%) cases of disease-free survival. The MAE and RMSE values were 0.553 and 0.887 years, respectively, for the random forest regression model, and 1.756 and 2.451 years, respectively, for the multivariate regression model. Both models accurately predicted overall survival. Comparison of the models through MAE, RMSE, and visual analysis showed higher accuracy for random forest than for multivariate linear regression. Further investigation of feature selection and model optimization may improve predictive power. These findings suggest that using machine learning in GBM prognostic modeling will improve clinical decision support. *Co-first authors.
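
Since this abstract reports both MAE and RMSE, a short sketch of how the two metrics differ may help. The values below are invented toy numbers, not survival data from the study: two sets of predictions share the same MAE, but the one with a single large miss has the higher RMSE.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: the average size of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error: squaring weights large errors more heavily."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

# Toy values in arbitrary units (invented for illustration)
actual = [1.0, 2.0, 3.0, 4.0]
even_errors = [1.5, 2.5, 3.5, 4.5]    # every prediction is off by 0.5
one_big_miss = [1.0, 2.0, 3.0, 6.0]   # three exact, one off by 2.0

for name, pred in [("even errors", even_errors), ("one big miss", one_big_miss)]:
    print(f"{name}: MAE = {mae(actual, pred):.2f}, RMSE = {rmse(actual, pred):.2f}")
```

A large gap between RMSE and MAE therefore hints that a few cases are predicted much worse than the rest.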


2021 ◽  
Vol 9 ◽  
Author(s):  
Fu-Sheng Chou ◽  
Laxmi V. Ghimire

Background: Pediatric myocarditis is a rare disease. The etiologies are multiple. Mortality associated with the disease is 5–8%. Prognostic factors were identified with the use of national hospitalization databases. Applying these identified risk factors for mortality prediction has not been reported. Methods: We used the Kids' Inpatient Database for this project. We manually curated fourteen variables as predictors of mortality based on the current knowledge of the disease, and compared performance of mortality prediction between linear regression models and a machine learning (ML) model. For ML, the random forest algorithm was chosen because of the categorical nature of the variables. Based on variable importance scores, a reduced model was also developed for comparison. Results: We identified 4,144 patients from the database for randomization into the primary (for model development) and testing (for external validation) datasets. We found that the conventional logistic regression model had low sensitivity (~50%) despite high specificity (>95%) or overall accuracy. On the other hand, the ML model struck a good balance between sensitivity (89.9%) and specificity (85.8%). The reduced ML model with the top five variables (mechanical ventilation, cardiac arrest, ECMO, acute kidney injury, ventricular fibrillation) was sufficient to approximate the prediction performance of the full model. Conclusions: The ML algorithm performs better than the linear regression model for mortality prediction in pediatric myocarditis in this retrospective dataset. Prospective studies are warranted to further validate the applicability of our model in clinical settings.
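
The pattern described in the results (low sensitivity despite high specificity and accuracy) is typical when the outcome is rare. The sketch below uses invented counts, not data from the Kids' Inpatient Database, and hypothetical function names to show how the three metrics follow from the same predictions.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = event)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, fp, tn, fn

def classification_metrics(actual, predicted):
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    return {
        "sensitivity": tp / (tp + fn),   # share of events that were caught
        "specificity": tn / (tn + fp),   # share of non-events correctly cleared
        "accuracy": (tp + tn) / len(actual),
    }

# Toy cohort (invented): 100 patients, 6 deaths.
# The classifier catches 3 of the 6 deaths and wrongly flags 2 of 94 survivors.
actual = [1] * 6 + [0] * 94
predicted = [1] * 3 + [0] * 3 + [1] * 2 + [0] * 92

metrics = classification_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these toy counts, accuracy stays high even though half of the deaths are missed, which is why sensitivity and specificity are reported separately.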


2021 ◽  
Vol 8 ◽  
Author(s):  
Robert A. Reed ◽  
Andrei S. Morgan ◽  
Jennifer Zeitlin ◽  
Pierre-Henri Jarreau ◽  
Héloïse Torchin ◽  
...  

Introduction: Preterm babies are a vulnerable population that experience significant short and long-term morbidity. Rehospitalisations constitute an important, potentially modifiable adverse event in this population. Improving the ability of clinicians to identify those patients at the greatest risk of rehospitalisation has the potential to improve outcomes and reduce costs. Machine-learning algorithms can provide potentially advantageous methods of prediction compared to conventional approaches like logistic regression. Objective: To compare two machine-learning methods (least absolute shrinkage and selection operator (LASSO) and random forest) to expert-opinion driven logistic regression modelling for predicting unplanned rehospitalisation within 30 days in a large French cohort of preterm babies. Design, Setting and Participants: This study used data derived exclusively from the population-based prospective cohort study of French preterm babies, EPIPAGE 2. Only those babies discharged home alive and whose parents completed the 1-year survey were eligible for inclusion in our study. All predictive models used a binary outcome, denoting a baby's status for an unplanned rehospitalisation within 30 days of discharge. Predictors included those quantifying clinical, treatment, maternal and socio-demographic factors. The predictive abilities of models constructed using LASSO and random forest algorithms were compared with a traditional logistic regression model. The logistic regression model comprised 10 predictors, selected by expert clinicians, while the LASSO and random forest included 75 predictors. Performance measures were derived using 10-fold cross-validation. Performance was quantified using area under the receiver operator characteristic curve, sensitivity, specificity, Tjur's coefficient of determination and calibration measures. Results: The rate of 30-day unplanned rehospitalisation in the eligible population used to construct the models was 9.1% (95% CI 8.2–10.1) (350/3,841). The random forest model demonstrated both an improved AUROC (0.65; 95% CI 0.59–0.7; p = 0.03) and specificity vs. logistic regression (AUROC 0.57; 95% CI 0.51–0.62, p = 0.04). The LASSO performed similarly (AUROC 0.59; 95% CI 0.53–0.65; p = 0.68) to logistic regression. Conclusions: Compared to an expert-specified logistic regression model, random forest offered improved prediction of 30-day unplanned rehospitalisation in preterm babies. However, all models offered relatively low levels of predictive ability, regardless of modelling method.
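
The AUROC used as the main performance measure here can be read as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition; the labels and scores are invented toy values, not EPIPAGE 2 data.

```python
def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example (invented): 1 = rehospitalised, 0 = not rehospitalised
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]   # predicted risks
print(f"AUROC = {auroc(labels, scores):.2f}")
```

A value of 0.5 corresponds to ranking no better than chance, which puts the reported values of 0.57 to 0.65 in context.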


Author(s):  
Sachin Kumar ◽  
Karan Veer

Aims: The objective of this research is to predict COVID-19 cases in India using machine learning approaches. Background: COVID-19, a respiratory disease caused by a member of the coronavirus family, led to a worldwide pandemic in 2020. The virus was first detected in the city of Wuhan, China, in December 2019. The disease took less than three months to spread across the globe. Objective: In this paper, we propose a regression model based on the support vector machine (SVM) to forecast the number of deaths, the number of recovered cases, and the total confirmed cases for the next 30 days. Method: For prediction, the data were collected from GitHub and India's Ministry of Health and Family Welfare from March 14, 2020, to December 3, 2020. The model was implemented in Python 3.6 (Anaconda) to forecast corona trends until September 21, 2020. The proposed methodology is based on predicting values using an SVM-based regression model with polynomial, linear, and rbf kernels. The dataset was divided into training and test sets with test sizes of 40% and 60% and verified with real data. The model performance parameters are mean square error, mean absolute error, and percentage accuracy. Results and Conclusion: The results show that the polynomial model obtained an accuracy score above 95%, the linear model scored above 90%, and the rbf model scored above 85% in predicting cumulative deaths, confirmed cases, and recovered cases.
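
The three kernels named in the method can be written out as plain functions. This is a sketch of the standard textbook forms only; the hyperparameter names and defaults (`degree`, `gamma`, `coef0`) are assumptions for illustration, since the abstract does not state which values were used.

```python
import math

def dot(x, z):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, z))

def linear_kernel(x, z):
    """k(x, z) = <x, z>"""
    return dot(x, z)

def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """k(x, z) = (gamma * <x, z> + coef0) ** degree"""
    return (gamma * dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    """k(x, z) = exp(-gamma * ||x - z||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

# Toy feature vectors (invented for illustration)
x, z = [1.0, 2.0], [3.0, 4.0]
print("linear:    ", linear_kernel(x, z))
print("polynomial:", polynomial_kernel(x, z, degree=2))
print("rbf:       ", rbf_kernel(x, z, gamma=0.1))
```

The kernel determines how similarity between two observations is measured, so a cumulative trend that curves over time can favour the polynomial kernel over the linear one.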


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Javad Ansarifar ◽  
Lizhi Wang ◽  
Sotirios V. Archontoulis

Abstract Crop yield prediction is crucial for global food security yet notoriously challenging due to multitudinous factors that jointly determine the yield, including genotype, environment, management, and their complex interactions. Integrating the power of optimization, machine learning, and agronomic insight, we present a new predictive model (referred to as the interaction regression model) for crop yield prediction, which has three salient properties. First, it achieved a relative root mean square error of 8% or less in three Midwest states (Illinois, Indiana, and Iowa) in the US for both corn and soybean yield prediction, outperforming state-of-the-art machine learning algorithms. Second, it identified about a dozen environment by management interactions for corn and soybean yield, some of which are consistent with conventional agronomic knowledge, whereas others require additional analysis or experiment to prove or disprove. Third, it quantitatively dissected crop yield into contributions from weather, soil, management, and their interactions, allowing agronomists to pinpoint the factors that favorably or unfavorably affect the yield of a given location under a given weather and management scenario. The most significant contribution of the new prediction model is its capability to produce accurate predictions and explainable insights simultaneously. This was achieved by training the algorithm to select features and interactions that are spatially and temporally robust to balance prediction accuracy for the training data and generalizability to the test data.


Author(s):  
Hossein Safarzadeh ◽  
Marco Leonesio ◽  
Giacomo Bianchi ◽  
Michele Monno

Abstract This work proposes a model for suggesting optimal process configuration in plunge centreless grinding operations. Seven different approaches were implemented and compared: a first-principles model, a neural network model with one hidden layer, a support vector regression (SVR) model with polynomial kernel function, a Gaussian process regression (GPR) model, and hybrid versions of those three models. The first approach is based on an enhancement of the well-known numerical process simulation of geometrical instability. The model takes into account the raw workpiece profile and possible wheel-workpiece loss of contact, which introduces an inherent limitation on the resulting profile waviness. Physical models, because of epistemic errors due to neglected or oversimplified functional relationships, can be too approximate to be considered in industrial applications. Moreover, in deterministic models, uncertainties affecting the various parameters are not explicitly considered. Complexity in centreless grinding models arises from phenomena like contact length dependency on local compliance, contact force and grinding wheel roughness, unpredicted material properties of the grinding wheel and workpiece, precision of the manual setup done by the operator, wheel wear and nature of wheel wear. In order to improve the overall model prediction accuracy and allow automated continuous learning, several machine learning techniques have been investigated: a Bayesian regularized neural network, an SVR model and a GPR model. To exploit the a priori knowledge embedded in physical models, hybrid models are proposed, where the neural network, SVR and GPR models are fed by the nominal process parameters enriched with the roundness predicted by the first-principles model. Those hybrid models result in an improved prediction capability.
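
The hybrid idea described in this abstract, feeding a learner the nominal process parameters plus the physics-based prediction, can be sketched as a feature-augmentation step. Everything named below is hypothetical: the parameter names, and in particular the toy formula standing in for the first-principles simulation, are placeholders chosen only so that the example runs and are not the authors' model.

```python
def first_principles_roundness(params):
    """Hypothetical stand-in for the physics-based simulation (NOT the authors' model).

    The formula is a toy placeholder chosen only so the sketch is runnable.
    """
    return 0.5 * params["feed_rate"] + 0.25 * params["workpiece_height"]

def hybrid_features(params, feature_order):
    """Nominal process parameters, enriched with the physics-based prediction."""
    nominal = [params[name] for name in feature_order]
    return nominal + [first_principles_roundness(params)]

# Hypothetical process configuration (names and values invented for illustration)
feature_order = ["feed_rate", "workpiece_height"]
params = {"feed_rate": 2.0, "workpiece_height": 10.0}

row = hybrid_features(params, feature_order)
print(row)  # this row would be the input to the neural network, SVR or GPR model
```

The learner then only has to model how measurements deviate from the physical prediction, rather than rediscovering the physics from data alone.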

