Corn Yield Prediction With Ensemble CNN-DNN

2021 ◽  
Vol 12 ◽  
Author(s):  
Mohsen Shahhosseini ◽  
Guiping Hu ◽  
Saeed Khaki ◽  
Sotirios V. Archontoulis

We investigate the predictive performance of two novel CNN-DNN machine learning ensemble models in predicting county-level corn yields across the US Corn Belt (12 states). The developed data set is a combination of management, environment, and historical corn yields from 1980 to 2019. Two scenarios for ensemble creation are considered: homogenous and heterogenous ensembles. In homogenous ensembles, the base CNN-DNN models are all the same, but they are generated with a bagging procedure to ensure they exhibit a certain level of diversity. Heterogenous ensembles are created from different base CNN-DNN models which share the same architecture but have different hyperparameters. Three types of ensemble creation methods were used to create several ensembles for each of the scenarios: Basic Ensemble Method (BEM), Generalized Ensemble Method (GEM), and stacked generalized ensembles. Results indicated that both designed ensemble types (heterogenous and homogenous) outperform the ensembles created from five individual ML models (linear regression, LASSO, random forest, XGBoost, and LightGBM). Furthermore, by introducing improvements over the heterogenous ensembles, the homogenous ensembles provide the most accurate yield predictions across US Corn Belt states. This model could make 2019 yield predictions with a root mean square error of 866 kg/ha, equivalent to a relative root mean square error of 8.5%, and could successfully explain about 77% of the spatio-temporal variation in the corn grain yields. The significant predictive power of this model can be leveraged for designing a reliable tool for corn yield prediction, which will in turn assist agronomic decision makers.
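The error metrics and the simplest of the three ensemble methods named in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes BEM means an unweighted average of the base-model predictions and that relative RMSE is RMSE divided by the mean observed yield, expressed as a percentage. The function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_observed


def basic_ensemble(predictions_per_model):
    """Unweighted average across base models (assumed reading of BEM).

    predictions_per_model holds one list of predictions per base model,
    all of the same length.
    """
    n_models = len(predictions_per_model)
    return [sum(column) / n_models for column in zip(*predictions_per_model)]
```

Under these assumptions, an RMSE of 866 kg/ha at 8.5% relative RMSE would correspond to a mean observed yield of roughly 10,000 kg/ha. GEM and stacking differ from this sketch in that they learn weights or a second-level model instead of averaging uniformly.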

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mohsen Shahhosseini ◽  
Guiping Hu ◽  
Isaiah Huber ◽  
Sotirios V. Archontoulis

Abstract: This study investigates whether coupling crop modeling and machine learning (ML) improves corn yield predictions in the US Corn Belt. The main objectives are to explore whether a hybrid approach (crop modeling + ML) would result in better predictions, investigate which combinations of hybrid models provide the most accurate predictions, and determine which crop-modeling features are most effective to integrate with ML for corn yield prediction. Five ML models (linear regression, LASSO, LightGBM, random forest, and XGBoost) and six ensemble models have been designed to address the research question. The results suggest that adding simulation crop model variables (APSIM) as input features to ML models can decrease yield prediction root mean squared error (RMSE) by 7 to 20%. Furthermore, we investigated partial inclusion of APSIM features in the ML prediction models and found that soil moisture related APSIM variables are most influential on the ML predictions, followed by crop-related and phenology-related variables. Finally, based on feature importance measures, simulated APSIM average drought stress and average water table depth during the growing season are the most important APSIM inputs to ML. This result indicates that weather information alone is not sufficient and that ML models need more hydrological inputs to make improved yield predictions.


2017 ◽  
Vol 48 (1) ◽  
Author(s):  
Josana Andreia Langner ◽  
Nereu Augusto Streck ◽  
Angelica Durigon ◽  
Stefanía Dalmolin da Silva ◽  
Isabel Lago ◽  
...  

ABSTRACT: The objective of this study was to compare the simulations of leaf appearance of landrace and improved maize cultivars using the CSM-CERES-Maize (linear) and the Wang and Engel (nonlinear) models. The coefficients of the models were calibrated using a data set of total leaf number collected for the 11/04/2013 sowing date for the landrace varieties ‘Cinquentinha’ and ‘Bico de Ouro’ and the simple hybrid ‘AS 1573PRO’. For the ‘BRS Planalto’ variety, model coefficients were estimated with data from the 12/13/2014 sowing date. The models were evaluated with independent data sets collected during the growing seasons of 2013/2014 (Experiment 1) and 2014/2015 (Experiment 2) in Santa Maria, RS, Brazil. Total number of leaves for both landrace and improved maize varieties was better estimated with the Wang and Engel model, with a root mean square error of 1.0 leaf, while estimations with the CSM-CERES-Maize model had a root mean square error of 1.5 leaves.


2002 ◽  
Vol 6 (4) ◽  
pp. 685-694 ◽  
Author(s):  
M. J. Hall ◽  
A. W. Minns ◽  
A. K. M. Ashrafuzzaman

Abstract. Flood quantile estimation for ungauged catchment areas continues to be a routine problem faced by the practising Engineering Hydrologist, yet the hydrometric networks in many countries are reducing rather than expanding. The result is an increasing reliance on methods for regionalising hydrological variables. Among the most widely applied techniques is the Method of Residuals, an iterative method of classifying catchment areas by their geographical proximity based upon the application of Multiple Linear Regression Analysis (MLRA). Alternative classification techniques, such as cluster analysis, have also been applied but not on a routine basis. However, hydrological regionalisation can also be regarded as a problem in data mining — a search for useful knowledge and models embedded within large data sets. In particular, Artificial Neural Networks (ANNs) can be applied both to classify catchments according to their geomorphological and climatic characteristics and to relate flow quantiles to those characteristics. This approach has been applied to three data sets from the south-west of England and Wales; to England, Wales and Scotland (EWS); and to the islands of Java and Sumatra in Indonesia. The results demonstrated that hydrologically plausible clusters can be obtained under contrasting conditions of climate. The four classes of catchment found in the EWS data set were found to be compatible with the three classes identified in the earlier study of a smaller data set from south-west England and Wales. Relationships for the parameters of the at-site distribution of annual floods can be developed that are superior to those based upon MLRA in terms of root mean square errors of validation data sets. Indeed, the results from Java and Sumatra demonstrate a clear advantage in reduced root mean square error of the dependent flow variable through recognising the presence of three classes of catchment. Wider evaluation of this methodology is recommended. 
Keywords: regionalisation, floods, catchment characteristics, data mining, artificial neural networks


2005 ◽  
Vol 68 (11) ◽  
pp. 2301-2309 ◽  
Author(s):  
DANILO T. CAMPOS ◽  
BRADLEY P. MARKS ◽  
MARK R. POWELL ◽  
MARK L. TAMPLIN

The robustness of a microbial growth model must be assessed before the model can be applied to new food matrices; therefore, a methodology for quantifying robustness was developed. A robustness index (RI) was computed as the ratio of the standard error of prediction to the standard error of calibration for a given model, where the standard error of calibration was defined as the root mean square error of the growth model against the data (log CFU per gram versus time) used to parameterize the model and the standard error of prediction was defined as the root mean square error of the model against an independent data set. This technique was used to evaluate the robustness of a broth-based model for aerobic growth of Escherichia coli O157:H7 (in the U.S. Department of Agriculture Agricultural Research Service Pathogen Modeling Program) in predicting growth in ground beef under different conditions. Comparison against previously published data (132 data sets with 1,178 total data points) from experiments in ground beef at various experimental conditions (4.8 to 45°C and pH 5.5 to 5.9) yielded RI values ranging from 0.11 to 2.99. The estimated overall RI was 1.13. At temperatures between 15 and 40°C, the RI was close to and smaller than 1, indicating that the growth model is relatively robust in that temperature range. However, the RI also was related (P < 0.05) to temperature. By quantifying the predictive accuracy relative to the expected accuracy, the RI could be a useful tool for comparing various models under different conditions.
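The robustness index defined in this abstract is a ratio of two root mean square errors, which can be sketched as follows. This is an illustrative sketch only: the function and argument names are hypothetical, and the inputs are assumed to be paired observed and model-predicted values (for example log CFU per gram) for the calibration data and for an independent data set.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def robustness_index(cal_observed, cal_predicted, ind_observed, ind_predicted):
    """RI = standard error of prediction / standard error of calibration.

    The calibration error uses the data that parameterized the model;
    the prediction error uses an independent data set.
    """
    sec = rmse(cal_observed, cal_predicted)  # standard error of calibration
    sep = rmse(ind_observed, ind_predicted)  # standard error of prediction
    if sec == 0:
        raise ValueError("calibration error is zero; RI is undefined")
    return sep / sec
```

An RI near 1 means the model predicts the independent data about as well as it fits its own calibration data, which matches the abstract's reading of the 15 to 40°C range.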


2019 ◽  
Vol 50 (3) ◽  
pp. 120-126
Author(s):  
Homayoon Ganji ◽  
Takamitsu Kajisa

Estimation of reference evapotranspiration (ET0) with the Food and Agricultural Organisation (FAO) Penman-Monteith model requires temperature, relative humidity, solar radiation, and wind speed data. The lack of availability of the complete data set at some meteorological stations is a severe restriction for the application of this model. To overcome this problem, ET0 can be calculated using alternative data, which can be obtained via procedures proposed in FAO paper No.56. To confirm the validity of reference evapotranspiration calculated using alternative data (ET0(Alt)), the root mean square error (RMSE) needs to be estimated; lower values of RMSE indicate better validity. However, RMSE does not explain the mechanism of error formation in a model equation; explaining the mechanism of error formation is useful for future model improvement. Furthermore, for calculating RMSE, ET0 calculations based on both complete and alternative data are necessary. An error propagation approach was introduced in this study both for estimating RMSE and for explaining the mechanism of error formation by using data from a 30-year period from 48 different locations in Japan. From the results, RMSE was confirmed to be proportional to the value produced by the error propagation approach (ΔET0). Therefore, the error propagation approach is applicable to estimating the RMSE of ET0(Alt) in the range of 12%. Furthermore, the error of ET0(Alt) is not only related to the variables’ uncertainty but also to the combination of the variables in the equation.


2017 ◽  
Author(s):  
Jan H Jensen

This document is my attempt at distilling some of the information in two papers published by Anthony Nicholls (J. Comput. Aided Mol. Des. 2014, 28, 887; ibid 2016, 30, 103). Anthony also very kindly provided some new equations, not found in the papers, in response to my questions. The paper describes how one determines whether the difference in accuracy of two methods in predicting some properties for the same data set is statistically significant using root-mean-square errors, mean absolute errors, mean errors, and Pearson's r values.


Author(s):  
Joe Wan ◽  
Michael Qu ◽  
Xianjun Hao ◽  
Ray Motha ◽  
John J. Qu

2017 ◽  
Vol 8 (2) ◽  
pp. 194
Author(s):  
Augustine C. Arize ◽  
Charles J. Berendt ◽  
Giuliana Campanelli Andreopoulos ◽  
Ioannis N. Kallianiotis ◽  
John Malindretos

This paper uses a large variety of different models and examines the predictive performance of these exchange rate models by applying parametric and non-parametric techniques. For forecasting, we will choose that predictor with the smallest root mean square forecast error (RMSE). The results show that the better model is equation (34), but none of them gives a perfect forecast. At the end, error correction versions of the models will be fit so that plausible long-run elasticities can be imposed on the fundamental variables of each model.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Zehui Jiang ◽  
Chao Liu ◽  
Baskar Ganapathysubramanian ◽  
Dermot J. Hayes ◽  
Soumik Sarkar

Abstract: Maize (corn) is the dominant grain grown in the world. Total maize production in 2018 equaled 1.12 billion tons. Maize is used primarily as an animal feed in the production of eggs, dairy, pork and chicken. The US produces 32% of the world’s maize followed by China at 22% and Brazil at 9% (https://apps.fas.usda.gov/psdonline/app/index.html#/app/home). Accurate national-scale corn yield prediction critically impacts mercantile markets through providing essential information about expected production prior to harvest. Publicly available high-quality corn yield prediction can help address emergent information asymmetry problems and in doing so improve price efficiency in futures markets. We build a deep learning model to predict corn yields, specifically focusing on county-level prediction across 10 states of the Corn-Belt in the United States, and pre-harvest prediction with monthly updates from August. The results show promising predictive power relative to existing survey-based methods and set the foundation for a publicly available county yield prediction effort that complements existing public forecasts.

