Statistical Seasonal Prediction Based on Regularized Regression

2017 ◽  
Vol 30 (4) ◽  
pp. 1345-1361 ◽  
Author(s):  
Timothy DelSole ◽  
Arindam Banerjee

Abstract This paper proposes a regularized regression procedure for finding a predictive relation between one variable and a field of other variables. The procedure estimates a linear prediction model under the constraint that the regression coefficients have smooth spatial structure. The smoothness constraint is imposed using a novel approach based on the eigenvectors of the Laplace operator over the domain, which results in a constrained optimization problem equivalent to either ridge regression or least absolute shrinkage and selection operator (LASSO) regression, which can be solved by standard numerical software. In addition, this paper explores an unconventional procedure whereby regression models are estimated from dynamical model output and then verified against observations—the reverse of the traditional order. The methodology is illustrated by constructing statistical prediction models of summer Texas-area temperature based on concurrent Pacific sea surface temperature (SST). None of the regularized regression models have statistically significant skill when estimated from observations. In contrast, when estimated from dynamical model output, the regression models have skill with respect to dynamical model data because of the substantially larger sample size available from dynamical model output. In addition, the regression models estimated from dynamical model data can predict observed anomalies with significant skill, even though no observations were used directly to estimate the regression models. The results indicate that dynamical models had no significant skill because they could not accurately predict the SST itself, not because they could not capture realistic SST teleconnections.
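The idea of imposing smoothness through Laplacian eigenvectors and then solving an ordinary ridge problem can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the domain is a 1-D chain of grid points, the penalty weights are `1 + eigenvalue`, and the names `path_laplacian` and `smooth_ridge` are hypothetical.

```python
import numpy as np

def path_laplacian(p):
    """Discrete Laplacian of a 1-D chain of p grid points (assumed toy domain)."""
    L = 2.0 * np.eye(p) - np.eye(p, k=1) - np.eye(p, k=-1)
    L[0, 0] = L[-1, -1] = 1.0  # end points have a single neighbour
    return L

def smooth_ridge(X, y, lam):
    """Minimise ||y - X b||^2 + lam * b^T (I + L) b via a standard ridge solve.

    The penalty I + L is an illustrative choice, not the paper's exact constraint.
    """
    n, p = X.shape
    evals, E = np.linalg.eigh(path_laplacian(p))  # Laplacian eigenvectors
    w = 1.0 + evals                               # assumed penalty weight per eigenvector
    Z = (X @ E) / np.sqrt(w)                      # rescaled predictors in eigenvector space
    # Ordinary ridge regression on the rescaled predictors.
    c = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)
    return E @ (c / np.sqrt(w))                   # map back to grid-point coefficients

# Toy usage with synthetic data (no real SST or temperature data involved).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 12))
y_demo = X_demo @ np.linspace(0.0, 1.0, 12) + 0.1 * rng.normal(size=40)
beta_demo = smooth_ridge(X_demo, y_demo, lam=5.0)
```

Rough eigenvectors carry large eigenvalues, so dividing each projected predictor by the square root of its weight makes a plain ridge penalty shrink those components more strongly. This is why an off-the-shelf ridge solver suffices in this sketch.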

Author(s):  
Vijay Kumar Dwivedi ◽  
Manoj Madhava Gore

Background: Stock price prediction is a challenging task. Social, economic, political, and various other factors cause frequent abrupt changes in the stock price. This article proposes a historical data-based ensemble system to predict the closing stock price with higher accuracy and consistency than existing stock price prediction systems. Objective: The primary objective of this article is to predict the closing price of a stock for the next trading day more accurately and consistently than the existing methods employed for stock price prediction. Method: The proposed system combines various machine learning-based prediction models using the least absolute shrinkage and selection operator (LASSO) regression regularization technique to improve the accuracy of the stock price prediction system over that of any single base prediction model. Results: The analysis of results for all eleven stocks (listed under the Information Technology sector on the Bombay Stock Exchange, India) reveals that the proposed system performs best (on all defined metrics of the proposed system) for training datasets and test datasets comprising all the stocks considered in the proposed system. Conclusion: The proposed ensemble model consistently predicts stock price with a high degree of accuracy compared with the existing methods used for prediction.
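Combining base models through LASSO can be sketched as a regression of the target on the base models' predictions, with the L1 penalty setting the weights of unhelpful models to zero. The sketch below is an assumption-laden illustration, not the authors' system: the coordinate-descent solver `lasso_cd`, the synthetic "base predictions", and the penalty value `alpha=0.1` are all hypothetical, and the data are assumed centred so no intercept is fitted.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink v towards zero by t; values within [-t, t] become zero."""
    return np.sign(v) * max(abs(v) - t, 0.0)

def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xw||^2 + alpha * ||w||_1."""
    n, m = X.shape
    w = np.zeros(m)
    z = (X ** 2).sum(axis=0) / n                  # per-column scale
    for _ in range(n_iter):
        for j in range(m):
            r_j = y - X @ w + X[:, j] * w[j]      # residual excluding column j
            rho = X[:, j] @ r_j / n
            w[j] = soft_threshold(rho, alpha) / z[j]
    return w

# Synthetic stand-ins for base-model predictions (not real stock data).
rng = np.random.default_rng(0)
y = rng.normal(size=400)                          # centred target
P = np.column_stack([
    y + 0.1 * rng.normal(size=400),               # accurate base model
    y + 0.5 * rng.normal(size=400),               # noisier base model
    rng.normal(size=400),                         # uninformative base model
])
weights = lasso_cd(P, y, alpha=0.1)
ensemble = P @ weights                            # combined prediction
```

In this toy setting the accurate model receives most of the weight and the uninformative one is dropped, which is the behaviour an L1-regularized combiner is chosen for.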


2016 ◽  
Vol 16 (2) ◽  
pp. 43-50 ◽  
Author(s):  
Samander Ali Malik ◽  
Assad Farooq ◽  
Thomas Gereke ◽  
Chokri Cherif

Abstract The present research work was carried out to develop prediction models for blended ring spun yarn evenness and tensile parameters using artificial neural networks (ANNs) and multiple linear regression (MLR). Polyester/cotton blend ratio, twist multiplier, back roller hardness and break draft ratio were used as input parameters to predict yarn evenness in terms of CVm% and yarn tensile properties in terms of tenacity and elongation. Feed forward neural networks with Bayesian regularisation support were successfully trained and tested using the available experimental data. The coefficients of determination of the ANN and regression models indicate that there is a strong correlation between the measured and predicted yarn characteristics, with acceptable mean absolute error values. The comparative analysis of the two modelling techniques shows that the ANNs perform better than the MLR models. The relative importance of input variables was determined using rank analysis through an input saliency test on optimised ANN models and standardised coefficients of regression models. These models are suitable for yarn manufacturers and can be used within the investigated knowledge domain.


1996 ◽  
Vol 29 (1) ◽  
pp. 5090-5095
Author(s):  
Vikram Krishnamurthy ◽  
H. Vincent Poor

2021 ◽  
Vol 156 (A4) ◽  
Author(s):  
N Hifi ◽  
N Barltrop

This paper applies a newly developed methodology to calibrate the corrosion model within a structural reliability analysis. The methodology combines data from experience (measurements and expert judgment) and prediction models to adjust the structural reliability models. Two corrosion models published in the literature have been used to demonstrate the model calibration technique. One model is used to predict future degradation and the second to represent the recorded inspection data. The results of the calibration process are presented and discussed.


2021 ◽  
Vol 42 (Supplement_1) ◽  
pp. S33-S34
Author(s):  
Morgan A Taylor ◽  
Randy D Kearns ◽  
Jeffrey E Carter ◽  
Mark H Ebell ◽  
Curt A Harris

Abstract Introduction A nuclear disaster would generate an unprecedented volume of thermal burn patients from the explosion and subsequent mass fires (Figure 1). Prediction models characterizing outcomes for these patients may better equip healthcare providers and other responders to manage large scale nuclear events. Logistic regression models have traditionally been employed to develop prediction scores for mortality of all burn patients. However, other healthcare disciplines have increasingly transitioned to machine learning (ML) models, which are automatically generated and continually improved, potentially increasing predictive accuracy. Preliminary research suggests ML models can predict burn patient mortality more accurately than commonly used prediction scores. The purpose of this study is to examine the efficacy of various ML methods in assessing thermal burn patient mortality and length of stay in burn centers. Methods This retrospective study identified patients with fire/flame burn etiologies in the National Burn Repository between 2009 and 2018. Patients were randomly partitioned into a 67%/33% split for training and validation. A random forest model (RF) and an artificial neural network (ANN) were then constructed for each outcome, mortality and length of stay. These models were then compared to logistic regression models and previously developed prediction tools with similar outcomes using a combination of classification and regression metrics. Results During the study period, 82,404 burn patients with a thermal etiology were identified. The ANN models will likely tend to overfit the data, which can be resolved by ending the model training early or adding additional regularization parameters. Further exploration of the advantages and limitations of these models is forthcoming as metric analyses become available.
Conclusions In this proof-of-concept study, we anticipate that at least one ML model will predict the targeted outcomes of thermal burn patient mortality and length of stay as judged by the fidelity with which it matches the logistic regression analysis. These advancements can then help disaster preparedness programs consider resource limitations during catastrophic incidents resulting in burn injuries.


2021 ◽  
Author(s):  
Lance F Merrick ◽  
Dennis N Lozada ◽  
Xianming Chen ◽  
Arron H Carter

Most genomic prediction models are linear regression models that assume continuous and normally distributed phenotypes, but responses to diseases such as stripe rust (caused by Puccinia striiformis f. sp. tritici) are commonly recorded in ordinal scales and percentages. Disease severity (SEV) and infection type (IT) data in germplasm screening nurseries generally do not follow these assumptions. In this regard, researchers may ignore the lack of normality, transform the phenotypes, use generalized linear models, or use supervised learning algorithms and classification models with no restriction on the distribution of response variables, which are less sensitive when modeling ordinal scores. The goal of this research was to compare classification and regression genomic selection models for skewed phenotypes using stripe rust SEV and IT in winter wheat. We extensively compared both regression and classification prediction models using two training populations composed of breeding lines phenotyped in four years (2016-2018, and 2020) and a diversity panel phenotyped in four years (2013-2016). The prediction models used 19,861 genotyping-by-sequencing single-nucleotide polymorphism markers. Overall, square root transformed phenotypes using rrBLUP and support vector machine regression models displayed the highest combination of accuracy and relative efficiency across the regression and classification models. Further, a classification system based on support vector machine and ordinal Bayesian models with a 2-Class scale for SEV reached the highest class accuracy of 0.99. This study showed that breeders can use linear and non-parametric regression models within their own breeding lines over combined years to accurately predict skewed phenotypes.


2021 ◽  
Author(s):  
Shaomei Yang ◽  
Haoyue Wu

Abstract PM2.5 has a significant negative impact on human health and atmospheric quality, and accurate prediction of its concentration is necessary. PM2.5 concentration is influenced by a combination of factors from both meteorological conditions and air quality. It is essential to identify the significant factors influencing PM2.5 concentrations in the prediction process. To address this issue, this paper proposes the quantile regression (QR) model based on the least absolute shrinkage and selection operator (LASSO), combined with kernel density estimation (KDE) for probabilistic density prediction of PM2.5 concentrations. The model uses LASSO regression to select the influential factors, and then the quartiles of daily PM2.5 concentrations obtained using the QR model are imported into the KDE model to obtain the probability density curves of PM2.5 concentrations. In this paper, empirical analysis is performed with the data sets of Beijing, China, and Jinan, China, and the accuracy of the model is evaluated using the mean absolute percentage error (MAPE) and the relative mean square error (RMSE). The simulation results reveal that the LASSO-QR-KDE model has a higher accuracy than the traditional prediction models and the currently used research models. The model provides a novel and excellent tool for policy makers to predict PM2.5 concentrations.
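The final step described above, turning a set of predicted quantile values into a probability density curve, might look roughly like the following Gaussian-kernel sketch. The quantile values are invented for illustration, the function name `kde_density` is hypothetical, and the normal-reference bandwidth rule is an assumed default rather than the authors' choice.

```python
import numpy as np

def kde_density(samples, grid, bandwidth=None):
    """Gaussian kernel density estimate of `samples`, evaluated on `grid`."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    if bandwidth is None:
        # Normal-reference rule of thumb (assumed default).
        bandwidth = 1.06 * samples.std(ddof=1) * n ** (-0.2)
    u = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (n * bandwidth)

# Hypothetical QR outputs for one day: predicted PM2.5 values at nine quantile levels.
quantile_preds = np.array([38.0, 42.5, 47.0, 51.0, 55.5, 60.0, 66.0, 73.0, 82.0])
grid = np.linspace(0.0, 150.0, 601)               # concentration axis
density = kde_density(quantile_preds, grid)
dx = grid[1] - grid[0]
total_mass = density.sum() * dx                   # Riemann-sum check, should be close to 1
```

The resulting curve gives a full predictive distribution for the day rather than a single point forecast, from which intervals or a most likely concentration can be read off.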


2011 ◽  
Vol 24 (4) ◽  
pp. 567-573 ◽  
Author(s):  
Sung-Min Myoung ◽  
Doo-Jin Lee ◽  
Hwa-Soo Kim ◽  
Jin-Nam Jo
