Teasing Apart Silvopasture System Components Using Machine Learning for Optimization

Soil Systems ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 41
Author(s):  
Tulsi P. Kharel ◽  
Amanda J. Ashworth ◽  
Phillip R. Owens ◽  
Dirk Philipp ◽  
Andrew L. Thomas ◽  
...  

Silvopasture systems combine tree and livestock production to minimize market risk and enhance ecological services. Our objective was to explore and develop a method for identifying driving factors linked to productivity in a silvopastoral system using machine learning. A multi-variable approach was used to detect factors that affect system-level output (i.e., plant production (tree and forage), soil factors, and animal response based on grazing preference). Variables from a three-year (2017–2019) grazing study, including forage, tree, soil, and terrain attribute parameters, were analyzed. Hierarchical variable clustering and a random forest model selected 10 important variables for each of four major clusters. A stepwise multiple linear regression and regression tree approach was used to predict cattle grazing hours per animal unit (h ha−1 AU−1) using 40 variables (10 per cluster) selected from 130 total variables. Overall, the variable ranking method selected more weighted variables for systems-level analysis. The regression tree performed better than stepwise linear regression for interpreting factor-level effects on animal grazing preference. Cattle were more likely to graze forage on soils with Cd levels <0.04 mg kg−1 (126% greater grazing hours per AU), soil Cr <0.098 mg kg−1 (108%), and a SAGA wetness index of <2.7 (57%). Cattle also preferred grazing (88%) native grasses compared to orchardgrass (Dactylis glomerata L.). The results show that water flow within the landscape (wetness index) and the associated metal distribution may be used as indicators of animal grazing preference. Overall, soil nutrient distribution patterns drove grazing response, although animal grazing preference was also influenced by aboveground (forage and tree), soil, and landscape attributes. Machine learning approaches helped explain pasture use and overall drivers of grazing preference in a multifunctional system.
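The thresholds reported above (for example, a soil attribute falling below a cutoff) are the kind of rule a regression tree produces. As a rough illustration of how a single tree split can be chosen, the Python sketch below searches for the threshold on one predictor that minimizes the summed squared error of the response on each side. The function name `best_split` and all numbers are hypothetical and synthetic; they are not data or code from the study.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations of values around their mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total_sse) for the single split on x that
    minimizes the squared error of y within the two resulting groups."""
    pairs = sorted(zip(x, y))
    best_threshold, best_sse = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best_sse:
            best_threshold, best_sse = threshold, total
    return best_threshold, best_sse


# Synthetic values for illustration only (NOT from the study):
# a hypothetical soil attribute and a hypothetical grazing response.
soil_attribute = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
grazing_response = [9.0, 10.0, 11.0, 4.0, 5.0, 3.0]

threshold, error = best_split(soil_attribute, grazing_response)
print(f"split at {threshold}, within-group SSE {error}")
```

A full regression tree applies this search recursively over many predictors; the single split is shown only because it yields the same style of readable cutoff quoted in the abstract.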

Long-term global warming prediction can be of major importance in various sectors, including climate-related studies, agriculture, energy, and medicine. This paper evaluates the performance of several machine learning algorithms (linear regression, multi-regression tree, support vector regression (SVR), and lasso) on the problem of predicting annual global warming from previously measured values over India. The first challenge lies in building a reliable, efficient statistical data model on a large data set that accurately captures the relationship between average annual temperature and potential factors such as the concentrations of carbon dioxide, methane, and nitrous oxide. The data are predicted and forecasted with linear regression because it obtained the highest accuracy for greenhouse gases and temperature among the methods considered. It was also found that CO2 is the major contributor to temperature change, followed by CH4 and then N2O. The analysed and predicted greenhouse gas and temperature data suggest that global warming can be comparatively reduced within a few years. Reducing global temperature would help the whole world, because not only humans but also other animals suffer from global temperature change.


2020 ◽  
Author(s):  
Yihuan Huang ◽  
Amanda Kay Montoya

Machine learning methods are being increasingly adopted in psychological research. Lasso performs variable selection and regularization, and is particularly appealing to psychology researchers because of its connection to linear regression. Researchers conflate properties of linear regression with properties of lasso; however, we demonstrate that this is not the case for models with categorical predictors. Specifically, the coding strategy used for categorical predictors impacts lasso’s performance but not linear regression. Group lasso is an alternative to lasso for models with categorical predictors. We demonstrate the inconsistency of lasso and group lasso models using a real data set: lasso performs different variable selection and has different prediction accuracy depending on the coding strategy, and group lasso performs consistent variable selection but has different prediction accuracy. Additionally, group lasso may include many predictors when very few are needed, leading to overfitting. Using Monte Carlo simulation, we show that categorical variables with one group mean differing from all others (one dominant group) are more likely to be included in the model by group lasso than lasso, leading to overfitting. This effect is strongest when the mean difference is large and there are many categories. Researchers primarily focus on the similarity between linear regression and lasso, but pay little attention to their different properties. This project demonstrates that when using lasso and group lasso, the effect of coding strategies should be considered. We conclude with recommended solutions to this issue and future directions of exploration to improve implementation of machine learning approaches in psychological science.
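One way to see why the coding strategy can matter for lasso but not for ordinary linear regression: two dummy codings of the same categorical predictor give identical fitted values, yet the sum of absolute coefficients, which is what the lasso penalty acts on, differs between them. The Python sketch below illustrates this with hypothetical group means and a hypothetical helper `dummy_coefficients`; it is not the authors' analysis and does not fit an actual lasso model.

```python
# Hypothetical group means for a three-level categorical predictor.
group_means = {"A": 0.0, "B": 0.0, "C": 5.0}


def dummy_coefficients(means, reference):
    """Unpenalized dummy-coded fit: the intercept is the reference-group
    mean and each slope is a group's difference from the reference."""
    intercept = means[reference]
    slopes = {g: m - intercept for g, m in means.items() if g != reference}
    return intercept, slopes


def fitted(intercept, slopes, group):
    """Fitted value for one group; the reference group has no slope."""
    return intercept + slopes.get(group, 0.0)


def l1_penalty(slopes):
    """Sum of absolute slopes (intercept assumed unpenalized)."""
    return sum(abs(b) for b in slopes.values())


for reference in ("A", "C"):
    b0, b = dummy_coefficients(group_means, reference)
    predictions = {g: fitted(b0, b, g) for g in group_means}
    print(reference, predictions, l1_penalty(b))
```

Both reference choices reproduce the same group means, but the penalized quantity is twice as large when the one differing group is used as the reference, so a penalized fit would shrink the two parameterizations differently.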


2021 ◽  
pp. 1-33
Author(s):  
Stéphane Loisel ◽  
Pierrick Piette ◽  
Cheng-Hsien Jason Tsai

Modeling policyholders’ lapse behaviors is important to a life insurer, since lapses affect pricing, reserving, profitability, liquidity, risk management, and the solvency of the insurer. In this paper, we apply two machine learning methods to lapse modeling. Then, we evaluate the performance of these two methods along with two popular statistical methods by means of statistical accuracy and a profitability measure. Moreover, we adopt an innovative point of view on the lapse prediction problem that comes from churn management. We transform the classification problem into a regression question and then perform optimization, which is new to lapse risk management. We apply the aforementioned four methods to a large real-world insurance dataset. The results show that Extreme Gradient Boosting (XGBoost) and support vector machine outperform logistic regression (LR) and classification and regression tree with respect to statistical accuracy, while LR performs as well as XGBoost in terms of retention gains. This highlights the importance of a proper validation metric when comparing different methods. The optimization after the transformation brings out significant and consistent increases in economic gains. Therefore, the insurer should conduct optimization on its economic objective to achieve optimal lapse management.


PLoS ONE ◽  
2021 ◽  
Vol 16 (10) ◽  
pp. e0258125
Author(s):  
Enes Gul ◽  
Mir Jafar Sadegh Safari ◽  
Ali Torabi Haghighi ◽  
Ali Danandeh Mehr

To reduce the problem of sedimentation in open channels, calculating flow velocity is critical. Undesirable operating costs arise due to sedimentation problems. To overcome these problems, the development of machine learning based models may provide reliable results. Recently, numerous studies have been conducted to model sediment transport under non-deposition conditions; however, the main deficiency of the existing studies is the use of a limited range of data in model development. To tackle this drawback, six data sets with wide ranges of pipe size, volumetric sediment concentration, channel bed slope, sediment size and flow depth are used for model development in this study. Moreover, two tree-based algorithms, namely M5 rule tree (M5RT) and M5 regression tree (M5RGT), are implemented, and the results are compared to the traditional regression equations available in the literature. The results show that machine learning approaches outperform traditional regression models. The tree-based algorithms, M5RT and M5RGT, provided satisfactory results in contrast to their regression-based alternatives, with RMSE = 1.184 and RMSE = 1.071, respectively. In order to recommend a practical solution, the tree structure algorithms are supplied to compute sediment transport in an open channel flow.
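The RMSE figures quoted above presumably follow the standard root-mean-square error definition, the square root of the mean squared difference between observed and predicted values. A minimal Python sketch of that definition, with made-up numbers that are not data from the study:

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
print(rmse(observed, predicted))
```

Lower values indicate predictions closer to the observations, in the same units as the predicted quantity.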


2021 ◽  
Author(s):  
Seth Margolis ◽  
Jacob Elder ◽  
Brent Hughes ◽  
Sonja Lyubomirsky

What are the most important predictors of subjective well-being? Using a nationally representative publicly available dataset from the Midlife in the United States project (N = 4,378), we applied linear regression, which often relies on assumptions of linearity and a priori interactions, and advanced machine learning approaches, which maximize prediction by thoroughly exploring nonlinear effects and higher-order interactions, to determine the ordering and characteristics of predictors of well-being. Advanced machine learning models generally did not predict well-being more accurately than did regression models, suggesting that many predictors of well-being may be linear and non-interactive. Consistent with this implication, the introduction of product and squared terms in regression models improved prediction, but only nominally. Our findings replicated previous research, with sociability, physical health, disengagement from goals, sex life quality, wealth, and religious activity emerging as the strongest predictors of well-being, and demographic factors emerging as relatively weak predictors. Furthermore, self-reported “aches” (the strongest “objective” predictor of well-being), stress reactivity, and disengagement negatively predicted well-being, reinforcing the role of stress in psychological maladjustment. Finally, unlike prior research, control over one’s life—and control over financial and work matters in particular—strongly predicted well-being.


2019 ◽  
Vol 70 (3) ◽  
pp. 214-224
Author(s):  
Bui Ngoc Dung ◽  
Manh Dzung Lai ◽  
Tran Vu Hieu ◽  
Nguyen Binh T. H.

Video surveillance is an emerging research field in intelligent transport systems. This paper presents techniques that use machine learning and computer vision for vehicle detection and tracking. First, machine learning approaches using Haar-like features and the AdaBoost algorithm for vehicle detection are presented. Second, approaches are given for detecting vehicles using a background subtraction method based on a Gaussian mixture model and for tracking vehicles using optical flow and multiple Kalman filters. The method has the advantage of distinguishing and tracking multiple vehicles individually. The experimental results demonstrate the high accuracy of the method.
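To make the Kalman filter step more concrete, the Python sketch below implements a scalar Kalman filter that smooths noisy position measurements along one axis. It is a deliberately simplified illustration under assumed settings: a constant-state model, hypothetical noise variances, and invented measurements. The paper's actual state model and parameters are not given in the abstract.

```python
def kalman_1d(measurements, x0=0.0, p0=1.0, process_var=0.01, meas_var=1.0):
    """Scalar Kalman filter with a constant-state model.

    Returns the state estimate after each measurement. All default
    parameters are illustrative assumptions.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: state assumed unchanged, uncertainty grows by process noise.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Invented noisy positions of one tracked object along one axis.
positions = [10.2, 9.8, 10.1, 9.9, 10.0]
track = kalman_1d(positions)
print(track)
```

Tracking several vehicles individually, as the abstract describes, would plausibly mean keeping one such filter per detected vehicle, typically with a richer state such as two-dimensional position and velocity.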


2017 ◽  
Author(s):  
Sabrina Jaeger ◽  
Simone Fulle ◽  
Samo Turk

Inspired by natural language processing techniques, we here introduce Mol2vec, an unsupervised machine learning approach to learn vector representations of molecular substructures. Similar to Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing up the vectors of the individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as the reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment independent and can thus also be easily used for proteins with low sequence similarities.
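The compound-encoding step described above, summing the vectors of a compound's substructures, can be sketched in a few lines of Python. The embedding table, substructure identifiers, and function name `compound_vector` below are hypothetical placeholders; this is not the Mol2vec package's API, and real embeddings would come from a trained model.

```python
def compound_vector(substructure_ids, embeddings):
    """Encode a compound as the element-wise sum of its substructure vectors.

    Raises KeyError if a substructure has no embedding; how unseen
    substructures are handled is left open in this sketch.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for sid in substructure_ids:
        for i, value in enumerate(embeddings[sid]):
            total[i] += value
    return total


# Hypothetical 3-dimensional embeddings for two substructures.
embeddings = {
    "sub_a": [1.0, 0.0, 2.0],
    "sub_b": [0.5, 1.0, -1.0],
}

# A compound in which "sub_a" occurs twice and "sub_b" once.
vector = compound_vector(["sub_a", "sub_b", "sub_a"], embeddings)
print(vector)
```

The resulting fixed-length vector could then serve as the feature input to a supervised model, as the abstract describes.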

