Machine learning based on extended generalized linear model applied in mixture experiments

Author(s):  
Gilberto Rodrigues Liska ◽  
Marcelo Ângelo Cirillo ◽  
Fortunato Silva de Menezes ◽  
Julio Silvio de Sousa Bueno Filho
2020 ◽  
Vol 14 (03) ◽  
Author(s):  
Mateus Schuh ◽  
José Augusto Spiazzi Favarin ◽  
Juliana Marchesan ◽  
Elisiane Alba ◽  
Elias Fernando Berra ◽  
...  

The study examines historical data on about 4700 air crashes worldwide since the first recorded crash in 1908. Given the immense impact of crashes on people and companies, the study aimed to use machine learning to predict fatalities. The data were split 75-25 into training and test sets. Using IBM SPSS Modeler, the authors fitted CHAID, Neural Network, Generalized Linear Model, XGBoost, Random Trees, and Ensemble models to predict fatalities in air crashes. The best result (90.6% accuracy) was achieved by a Neural Network with one hidden layer. The results also include a comparison of predicted versus observed values for the test data.
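As a rough illustration of the 75-25 partition and the accuracy measure mentioned in this abstract, the Python sketch below splits a labelled data set and scores predictions on the held-out quarter. The records, the helper names `split_75_25` and `accuracy`, and the majority-class placeholder model are all hypothetical; the study itself used IBM SPSS Modeler, not this code.

```python
import random

def split_75_25(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) with a 75/25 split."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]

def accuracy(observed, predicted):
    """Fraction of predictions that match the observed labels."""
    hits = sum(1 for o, p in zip(observed, predicted) if o == p)
    return hits / len(observed)

# Hypothetical records of the form (feature, label); not data from the study.
records = [(i, i % 2) for i in range(100)]
train, test = split_75_25(records)

# Placeholder model (an assumption): always predict the most common training label.
train_labels = [label for _, label in train]
majority = max(set(train_labels), key=train_labels.count)
test_accuracy = accuracy([label for _, label in test], [majority] * len(test))
```

Scoring only on the held-out 25% is what makes an accuracy figure an estimate of out-of-sample performance rather than of fit to the training data.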


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257027
Author(s):  
Hing Ling Chan ◽  
Minling Pan

Fishing trip cost is an important element in evaluating the economic performance of fisheries, assessing the economic effects of fisheries management alternatives, and providing input for ecosystem and bioeconomic modeling. However, many fisheries have limited trip-level data due to low observer coverage. This article introduces a modeling approach that combines a generalized linear model (GLM) with machine learning (ML) techniques to estimate the functional forms and predict the fishing trip costs of unsampled trips. Lasso regularization of the GLM and ML cross-validation are carried out together, both to select predictors and to evaluate the predictive power of the model. The approach is applied to estimate trip-level fishing costs from the empirical sampled trip costs and the associated trip-level fishing operational data and vessel characteristics in the Hawaii and American Samoa longline fisheries. Building models this way is particularly important when there is no strong theoretical guideline on predictor selection. Compared with previous trip cost estimation efforts for the Hawaii longline fishery, the approach also addresses the skewness of trip cost data and provides a measure of predictive power. As a result, fishing trip costs can be estimated for all trips in the fishery. Lastly, the study applies the estimated trip cost model in an empirical analysis of the impacts of spatial regulations on trip costs in the Hawaii longline fishery. The results show that closing the Western and Central Pacific Ocean (WCPO) could induce an average 14% increase in fishing trip costs, while the trip cost impacts of Eastern Pacific Ocean (EPO) closures could be lower.
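The combination of Lasso regularization and cross-validation described in this abstract can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' model: it uses the Gaussian, identity-link special case of a GLM, synthetic data, a three-value penalty grid, and hypothetical helper names (`lasso_fit`, `cv_mse`). The abstract does not state the family, link, or software used.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def predict(b0, beta, row):
    """Linear predictor: intercept plus weighted sum of the features."""
    return b0 + sum(b * x for b, x in zip(beta, row))

def lasso_fit(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n) * sum (y - b0 - x.beta)^2 + lam * sum |beta|."""
    n, p = len(X), len(X[0])
    b0, beta = sum(y) / n, [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = zj = 0.0
            for row, target in zip(X, y):
                # Residual with feature j's own contribution added back.
                partial = target - predict(b0, beta, row) + beta[j] * row[j]
                rho += row[j] * partial
                zj += row[j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (zj / n) if zj > 0 else 0.0
        # Unpenalized intercept: mean residual given the current coefficients.
        b0 = sum(t - predict(0.0, beta, r) for r, t in zip(X, y)) / n
    return b0, beta

def cv_mse(X, y, lam, k=5):
    """Mean squared prediction error of the Lasso fit over k folds."""
    total = 0.0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        b0, beta = lasso_fit([X[i] for i in train], [y[i] for i in train], lam)
        total += sum((y[i] - predict(b0, beta, X[i])) ** 2 for i in test)
    return total / len(X)

# Synthetic data (an assumption): only the first of three predictors matters.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(60)]
y = [3.0 * row[0] + rng.gauss(0, 0.1) for row in X]

# Pick the penalty with the lowest cross-validated error, then refit on all data.
grid = [0.01, 0.1, 1.0]
best_lam = min(grid, key=lambda lam: cv_mse(X, y, lam))
b0, beta = lasso_fit(X, y, best_lam)
```

The same cross-validation loop serves both purposes named in the abstract: the penalty it selects determines which coefficients are shrunk to zero (predictor selection), and the held-out error it reports measures predictive power.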


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Hamid Gholami ◽  
Aliakbar Mohammadifar ◽  
Dieu Tien Bui ◽  
Adrian L. Collins

Land susceptibility to wind erosion hazard in Isfahan province, Iran, was mapped by testing 16 advanced regression-based machine learning methods: Robust linear regression (RLR), Cforest, Non-convex penalized quantile regression (NCPQR), Neural network with feature extraction (NNFE), Monotone multi-layer perceptron neural network (MMLPNN), Ridge regression (RR), Boosting generalized linear model (BGLM), Negative binomial generalized linear model (NBGLM), Boosting generalized additive model (BGAM), Spline generalized additive model (SGAM), Spike and slab regression (SSR), Stochastic gradient boosting (SGB), Support vector machine (SVM), Relevance vector machine (RVM), Cubist, and Adaptive network-based fuzzy inference system (ANFIS). Thirteen factors controlling wind erosion were mapped, and multicollinearity among these factors was quantified using the tolerance coefficient (TC) and variance inflation factor (VIF). Model performance was assessed by RMSE, MAE, MBE, and a Taylor diagram using both training and validation datasets. The results showed that five models (MMLPNN, SGAM, Cforest, BGAM and SGB) are capable of delivering high prediction accuracy for land susceptibility to wind erosion hazard. DEM, precipitation, and vegetation (NDVI) are the most critical factors controlling wind erosion in the study area. Overall, regression-based machine learning models are efficient techniques for mapping land susceptibility to wind erosion hazards.
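The diagnostics named in this abstract have simple closed forms, sketched below in Python. Two assumptions are made for illustration: MBE is taken as the mean of predicted minus observed (sign conventions vary), and the tolerance and VIF are computed for the two-predictor case only, where the R² of regressing one predictor on the other equals their squared Pearson correlation. With thirteen factors, as in the study, each R² would come from a regression on all the other factors. All numbers are made up.

```python
import math

def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def mbe(obs, pred):
    """Mean bias error, here defined as mean(predicted - observed)."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)

def tolerance_and_vif(x1, x2):
    """Tolerance (1 - R^2) and VIF (1 / tolerance) for a pair of predictors."""
    m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
    sxy = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    sxx = sum((a - m1) ** 2 for a in x1)
    syy = sum((b - m2) ** 2 for b in x2)
    r_squared = sxy ** 2 / (sxx * syy)
    tolerance = 1.0 - r_squared
    return tolerance, 1.0 / tolerance

# Hypothetical observed and predicted values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 4.0, 3.0]
errors = (rmse(observed, predicted), mae(observed, predicted), mbe(observed, predicted))

# Hypothetical pair of correlated predictors.
tc, vif = tolerance_and_vif([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0])
```

A tolerance near 1 (VIF near 1) indicates a predictor that is nearly uncorrelated with the others, while a tolerance near 0 (large VIF) flags multicollinearity.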


Author(s):  
Bo Lan ◽  
Perry Haaland ◽  
Ashok Krishnamurthy ◽  
David B. Peden ◽  
Patrick L. Schmitt ◽  
...  

ICEES (Integrated Clinical and Environmental Exposures Service) provides a disease-agnostic, regulatory-compliant approach for openly exposing and analyzing clinical data that have been integrated at the patient level with environmental exposures data. ICEES is equipped with basic features to support exploratory analysis using statistical approaches, such as bivariate chi-square tests. We recently developed a method for using ICEES to generate multivariate tables for subsequent application of machine learning and statistical models. The objective of the present study was to use this approach to identify predictors of asthma exacerbations through the application of three multivariate methods: conditional random forest, conditional tree, and generalized linear model. Among seven potential predictor variables, we found five to be of significant importance using both conditional random forest and conditional tree: prednisone, race, airborne particulate exposure, obesity, and sex. The conditional tree method additionally identified several significant two-way and three-way interactions among the same variables. When we applied a generalized linear model, we identified four significant predictor variables, namely prednisone, race, airborne particulate exposure, and obesity. When ranked in order by effect size, the results were in agreement with the results from the conditional random forest and conditional tree methods as well as the published literature. Our results suggest that the open multivariate analytic capabilities provided by ICEES are valid in the context of an asthma use case and likely will have broad value in advancing open research in environmental and public health.
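The bivariate chi-square test mentioned in this abstract reduces to comparing observed cell counts with the counts expected under independence. The Python sketch below computes the Pearson statistic and its degrees of freedom for a contingency table; the function name `chi_square` and the 2x2 counts are hypothetical and are not ICEES output or part of the ICEES API.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: rows = exposure (low, high), columns = exacerbation (no, yes).
stat, dof = chi_square([[10, 20], [20, 10]])
```

For one degree of freedom, a statistic above about 3.84 corresponds to significance at the 5% level, so the made-up table here would count as a significant association.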

