Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models

2014
Vol 26 (2)
pp. 796-808
Author(s):
Peter C Austin
Ewout W Steyerberg

We conducted an extensive set of empirical analyses to examine the effect of the number of events per variable (EPV) on the relative performance of three different methods for assessing the predictive accuracy of a logistic regression model: apparent performance in the analysis sample, split-sample validation, and optimism correction using bootstrap methods. Using a single dataset of patients hospitalized with heart failure, we compared the estimates of discriminatory performance from these methods to those for a very large independent validation sample arising from the same population. As anticipated, the apparent performance was optimistically biased, with the degree of optimism diminishing as the number of events per variable increased. Differences between the bootstrap-corrected approach and the use of an independent validation sample were minimal once the number of events per variable was at least 20. Split-sample assessment yielded overly pessimistic and highly uncertain estimates of model performance. Apparent performance estimates had lower mean squared error than split-sample estimates, but the lowest mean squared error was obtained by bootstrap-corrected optimism estimates. For the bias, variance, and mean squared error of the performance estimates, the penalty incurred by split-sample validation was equivalent to reducing the sample size by the proportion of the sample withheld for model validation. In conclusion, split-sample validation is inefficient, and apparent performance is too optimistic for internal validation of regression-based prediction models. Modern validation methods, such as bootstrap-based optimism correction, are preferable. While these findings may be unsurprising to many statisticians, the results of the current study reinforce what should be considered good statistical practice in the development and validation of clinical prediction models.
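A minimal sketch of the bootstrap optimism correction compared in this study, assuming generic predictor and binary-outcome arrays `X` and `y` and using scikit-learn; the 200-replicate setting and variable names are illustrative, not taken from the paper.

```python
# Sketch of Harrell-style bootstrap optimism correction for the c-statistic
# (area under the ROC curve) of a logistic regression model. Assumes numpy
# arrays X (predictors) and y (binary outcome); settings are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def optimism_corrected_auc(X, y, n_boot=200, seed=0):
    rng = np.random.default_rng(seed)
    model = LogisticRegression(max_iter=1000).fit(X, y)
    apparent = roc_auc_score(y, model.predict_proba(X)[:, 1])  # optimistic

    optimism = []
    n = len(y)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)              # resample with replacement
        Xb, yb = X[idx], y[idx]
        if len(np.unique(yb)) < 2:               # skip degenerate resamples
            continue
        mb = LogisticRegression(max_iter=1000).fit(Xb, yb)
        auc_boot = roc_auc_score(yb, mb.predict_proba(Xb)[:, 1])  # apparent on bootstrap sample
        auc_orig = roc_auc_score(y, mb.predict_proba(X)[:, 1])    # same model on original sample
        optimism.append(auc_boot - auc_orig)

    return apparent - np.mean(optimism)          # optimism-corrected estimate
```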

2013
Vol 17 (11)
pp. 4713-4728
Author(s):
S. Terzer
L. I. Wassenaar
L. J. Araguás-Araguás
P. K. Aggarwal

Abstract. A regionalized cluster-based water isotope prediction (RCWIP) approach, based on the Global Network of Isotopes in Precipitation (GNIP), was demonstrated for the purpose of predicting point- and large-scale spatio-temporal patterns of the stable isotope composition (δ2H, δ18O) of precipitation around the world. Unlike earlier global-domain, fixed-regressor models, RCWIP predefined 36 climatic cluster domains and tested all model combinations from an array of climatic and spatial regressor variables to obtain the best predictive approach for each cluster domain, as indicated by root-mean-squared error (RMSE) and variogram analysis. Fuzzy membership fractions were thereafter used as weights to seamlessly amalgamate the results of the optimized climatic-zone prediction models into a single predictive mapping product, such as global or regional amount-weighted mean annual, mean monthly, or growing-season δ18O/δ2H in precipitation. Comparative tests revealed that the RCWIP approach outperformed classical global fixed-regression, interpolation-based models more than 67% of the time and clearly improved predictive accuracy and precision. All RCWIP isotope mapping products are available as gridded GeoTIFF files from the IAEA website (www.iaea.org/water) and are intended for use in hydrology, climatology, food authenticity, ecology, and forensics.
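The fuzzy-membership weighting step can be illustrated with a small sketch; the two-cluster setup, regressors and data below are hypothetical placeholders rather than the RCWIP model itself.

```python
# Sketch of fuzzy-membership weighting: cluster-specific regression predictions
# are blended into a single value per site using that site's membership
# fractions. Two clusters and two regressors are hypothetical placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
# hypothetical training data per climatic cluster: [latitude, altitude] -> d18O
X1, y1 = rng.normal(size=(50, 2)), rng.normal(-8, 2, 50)
X2, y2 = rng.normal(size=(50, 2)), rng.normal(-12, 3, 50)
models = [LinearRegression().fit(X1, y1), LinearRegression().fit(X2, y2)]

X_new = rng.normal(size=(4, 2))                   # sites to predict
membership = np.array([[0.7, 0.3],                # fuzzy membership fractions,
                       [0.2, 0.8],                # rows sum to 1
                       [0.5, 0.5],
                       [0.9, 0.1]])

per_cluster = np.column_stack([m.predict(X_new) for m in models])
blended = (membership * per_cluster).sum(axis=1)  # weighted amalgamation
print(blended)
```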


2020
Vol 74 (2)
pp. 159-191
Author(s):
Rok Blagus
Jelle J. Goeman

2009
Vol 24 (5)
pp. 1401-1415
Author(s):
Elizabeth E. Ebert
William A. Gallus

Abstract The contiguous rain area (CRA) method for spatial forecast verification is a features-based approach that evaluates the properties of forecast rain systems, namely, their location, size, intensity, and finescale pattern. It is one of many recently developed spatial verification approaches being evaluated as part of the Spatial Forecast Verification Methods Intercomparison Project. To better understand the strengths and weaknesses of the CRA method, it has been tested here on a set of idealized geometric and perturbed forecasts with known errors, as well as nine precipitation forecasts from three high-resolution numerical weather prediction models. The CRA method was able to identify the known errors for the geometric forecasts, but only after a modification was introduced to allow nonoverlapping forecast and observed features to be matched. For the perturbed cases, in which a radar rain field was spatially translated and amplified to simulate forecast errors, the CRA method also reproduced the known errors, except when a high-intensity threshold was used to define the CRA (≥10 mm h−1) and a large translation error was imposed (>200 km). The decomposition of total error into displacement, volume, and pattern components reflected the source of the error almost all of the time when a mean squared error formulation was used, but not necessarily when a correlation-based formulation was used. When applied to real forecasts, the CRA method gave similar results whether the best-fit criterion for matching forecast and observed features was minimization of the mean squared error or maximization of the correlation coefficient. The diagnosed displacement error was somewhat sensitive to the choice of search distance. Of the many diagnostics produced by this method, the errors in the mean and peak rain rate between the forecast and observed features showed the best correspondence with subjective evaluations of the forecasts, while the spatial correlation coefficient (after matching) did not reflect the subjective judgments.
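A toy sketch of the displacement/volume/pattern decomposition under the mean-squared-error formulation; the wrap-around shifting and the small search window are simplifications, and the field values are made up.

```python
# Sketch of a CRA-style error decomposition: the forecast field is shifted to
# best match the observations (minimum MSE), then the total MSE is split into
# displacement, volume and pattern components. np.roll wrap-around and the
# small search window are simplifications for illustration.
import numpy as np

def cra_decompose(fcst, obs, max_shift=5):
    def mse(a, b):
        return np.mean((a - b) ** 2)

    total = mse(fcst, obs)
    best = (0, 0, total)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            m = mse(np.roll(fcst, (dy, dx), axis=(0, 1)), obs)
            if m < best[2]:
                best = (dy, dx, m)
    dy, dx, mse_shifted = best
    shifted = np.roll(fcst, (dy, dx), axis=(0, 1))

    displacement = total - mse_shifted                  # error removed by best shift
    volume = (shifted.mean() - obs.mean()) ** 2         # mean (volume) difference
    pattern = mse_shifted - volume                      # remaining finescale error
    return {"shift": (dy, dx), "total": total,
            "displacement": displacement, "volume": volume, "pattern": pattern}

# toy example: observed blob vs. the same blob translated by 3 grid points
obs = np.zeros((40, 40)); obs[15:20, 15:20] = 10.0
fcst = np.roll(obs, (3, 3), axis=(0, 1))
print(cra_decompose(fcst, obs))
```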


2020
Author(s):
Rafael Massahiro Yassue
José Felipe Gonzaga Sabadin
Giovanni Galli
Filipe Couto Alves
Roberto Fritsche-Neto

Abstract Usually, the comparison among genomic prediction models is based on validation schemes such as repeated random subsampling (RRS) or K-fold cross-validation. Nevertheless, the design of the training and validation sets strongly affects how models are compared and introduces subjectivity into that comparison. The procedures cited above overlap across replicates, which can inflate estimates and violate residual independence due to resampling, potentially yielding less accurate results. Furthermore, post hoc tests such as ANOVA are not recommended because the assumption of residual independence is not fulfilled. Thus, we propose a new way to sample observations for building training and validation sets based on a cross-validation alpha-based design (CV-α). CV-α is designed to create several validation scenarios (replicates x folds), regardless of the number of treatments. Using CV-α, the number of genotypes assigned to the same fold across replicates was much lower than with K-fold, indicating higher residual independence. Therefore, based on the CV-α results, as proof of concept, we could compare the proposed methodology to RRS and K-fold via ANOVA, applying four genomic prediction models to a simulated and a real dataset. Concerning predictive ability and bias, all validation methods showed similar performance. However, regarding the mean squared error and coefficient of variation, the CV-α method presented the best performance under the evaluated scenarios. Moreover, as it has no additional cost or complexity, it is more reliable and allows the use of non-subjective methods to compare models and factors. Therefore, CV-α can be considered a more precise validation methodology for model selection.
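The fold-overlap quantity mentioned above (how often genotypes share a fold across replicates) can be computed with a short sketch; the assignment below uses ordinary repeated K-fold rather than the actual alpha-design allocation, and all counts are illustrative.

```python
# Sketch of a fold co-occurrence diagnostic: for repeated K-fold cross-validation,
# count how often each pair of genotypes lands in the same fold across replicates.
# CV-alpha aims to keep this co-occurrence low; the alpha-design fold assignment
# itself is not reproduced here.
import numpy as np

def pairwise_cooccurrence(fold_labels):
    """fold_labels: (n_replicates, n_genotypes) array of fold indices."""
    reps, n = fold_labels.shape
    co = np.zeros((n, n))
    for r in range(reps):
        same = fold_labels[r][:, None] == fold_labels[r][None, :]
        co += same
    return co - reps * np.eye(n)          # drop self-pairs from the counts

rng = np.random.default_rng(0)
n_geno, k, reps = 100, 5, 10
labels = np.array([rng.permutation(np.repeat(np.arange(k), n_geno // k))
                   for _ in range(reps)])
co = pairwise_cooccurrence(labels)
print("mean times a pair shares a fold:", co[np.triu_indices(n_geno, 1)].mean())
```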


2021
Vol 5 (3)
pp. 439-445
Author(s):
Dwi Marlina
Fatchul Arifin

The number of tourists fluctuates every month, as happens at Kaliadem Merapi, Sleman. The purpose of this research is to develop a system for predicting the number of tourists based on artificial neural networks. This study uses an artificial neural network with the backpropagation algorithm as the data processing method. The work comprises two processes, namely training and testing, with the following stages: (1) collecting input and target data; (2) normalizing input and target data; (3) creating the artificial neural network architecture using the Matlab GUI (Graphical User Interface) facilities; (4) conducting the training and testing processes; (5) normalizing the predicted data; (6) analyzing the predicted data. In the data analysis, the MSE (Mean Squared Error) value is 0.0091528 in the training process and 0.0051424 in the testing process. In addition, the prediction accuracy in the testing process is around 91.32%. Because the resulting MSE values are relatively small and the prediction accuracy is relatively high, this system can be used to predict the number of tourists at Kaliadem Merapi, Sleman.
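A rough sketch of the normalize/train/test workflow; the study used Matlab's neural network GUI, so scikit-learn's MLPRegressor (a backpropagation-trained network) stands in here, and the monthly visitor series is synthetic.

```python
# Sketch of the normalize / train / test workflow described above, with a
# synthetic monthly visitor-count series and MLPRegressor as a stand-in for
# the Matlab backpropagation network.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
visitors = 5000 + 2000 * np.sin(np.arange(60) * 2 * np.pi / 12) + rng.normal(0, 300, 60)

# use the previous 12 months as inputs, the next month as the target
X = np.array([visitors[i:i + 12] for i in range(len(visitors) - 12)])
y = visitors[12:]

x_scaler, y_scaler = MinMaxScaler(), MinMaxScaler()
Xn = x_scaler.fit_transform(X)
yn = y_scaler.fit_transform(y.reshape(-1, 1)).ravel()

split = 36                                            # training vs. testing rows
net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0)
net.fit(Xn[:split], yn[:split])

print("training MSE:", mean_squared_error(yn[:split], net.predict(Xn[:split])))
print("testing  MSE:", mean_squared_error(yn[split:], net.predict(Xn[split:])))
```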


2017
Vol 57 (2)
pp. 229
Author(s):
Farhad Ghafouri-Kesbi
Ghodratollah Rahimi-Mianji
Mahmood Honarvar
Ardeshir Nejati-Javaremi

Three machine learning algorithms, Random Forests (RF), Boosting and Support Vector Machines (SVM), as well as Genomic Best Linear Unbiased Prediction (GBLUP), were used to predict genomic breeding values (GBV), and their predictive performance was compared across different combinations of heritability (0.1, 0.3, and 0.5), number of quantitative trait loci (QTL) (100, 1000) and distribution of QTL effects (normal, uniform and gamma). To this end, a genome comprising five chromosomes of one Morgan each was simulated, on which 10000 bi-allelic single nucleotide polymorphisms were distributed. Pearson's correlation between the true and predicted GBV and the mean squared error of GBV prediction were used, respectively, as measures of the predictive accuracy and the overall fit achieved with each method. In all methods, prediction accuracy increased with increasing heritability and with a decreasing number of QTL. GBLUP had better predictive accuracy than the machine learning methods, in particular in the scenarios with a higher number of QTL and normal or uniform distributions of QTL effects, although in most cases the differences were non-significant. In the scenarios with a small number of QTL and a gamma distribution of QTL effects, Boosting outperformed the other methods. Regarding the mean squared error of GBV prediction, in most cases Boosting outperformed the other methods, although the estimates were close to those of GBLUP. Among the methods studied, SVM was the most efficient user of memory at 0.6 gigabytes (GB), followed by RF, GBLUP and Boosting with memory requirements of 1.2 GB, 1.3 GB and 2.3 GB, respectively. Regarding computational time, GBLUP, SVM, RF and Boosting ranked first, second, third and last with 10 min, 15 min, 75 min and 600 min, respectively. It was concluded that although stochastic gradient Boosting can predict GBV with high prediction accuracy, its significantly longer computational time and memory requirement can be a serious limitation. Therefore, the use of other variants of Boosting, such as Random Boosting, is recommended for genomic evaluation.
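A toy version of this kind of comparison, with GBLUP represented by ridge regression on the markers (the RR-BLUP equivalent); the marker counts, QTL numbers and hyperparameters are illustrative, not those of the simulation described above.

```python
# Sketch of comparing GBLUP-style ridge regression with RF, Boosting and SVM
# for genomic prediction on a toy simulated SNP dataset. All sizes and
# hyperparameters are illustrative placeholders.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
n, p, n_qtl, h2 = 600, 1000, 100, 0.3
M = rng.binomial(2, 0.5, size=(n, p)).astype(float)      # SNP genotypes (0/1/2)
qtl = rng.choice(p, n_qtl, replace=False)
effects = rng.normal(0, 1, n_qtl)
g = M[:, qtl] @ effects                                   # true breeding values
e = rng.normal(0, np.sqrt(g.var() * (1 - h2) / h2), n)    # noise for target h2
y = g + e

train, test = np.arange(500), np.arange(500, n)
models = {
    "GBLUP (ridge)": Ridge(alpha=p * (1 - h2) / h2),      # heuristic shrinkage
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
    "Boosting": GradientBoostingRegressor(random_state=0),
    "SVM": SVR(kernel="rbf", C=10.0),
}
for name, m in models.items():
    m.fit(M[train], y[train])
    pred = m.predict(M[test])
    acc = np.corrcoef(pred, g[test])[0, 1]                # accuracy vs. true GBV
    print(f"{name}: accuracy={acc:.2f}, MSE={mean_squared_error(g[test], pred):.2f}")
```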


2006
Vol 3 (1)
pp. 123
Author(s):
You Hoo Tew
Enylina Nordin

This study attempts to construct and test a financial distress prediction model for Malaysian companies. The sample for this study consists of 84 companies listed on Bursa Malaysia that became financially distressed in 2001 and 2002 and a matched (by industry and firm size) sample of 84 financially healthy companies. The model is constructed by employing logistic regression analysis based on pooled data of 5 years prior to financial distress. The model is first derived using the estimation sample and then tested using the validation sample. Adding to the existing research on financial distress prediction models, the current model utilizes measures of shareholders' equity to total liabilities, shareholders' equity to total assets, current liabilities to total assets, total borrowings to total assets and inventory turnover. The results are encouraging, as the model developed for predicting corporate financial distress in Malaysia is reliable up to 5 years prior to financial distress. It is also believed that the prediction model can be useful to different groups of users such as policy makers, financial institutions, creditors, managers, bankers, investors and shareholders.
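A sketch of the modelling strategy described: a logistic regression on the five ratios, fitted on an estimation sample and scored on a validation sample; the data below are random placeholders, not the Bursa Malaysia sample.

```python
# Sketch of fitting a distress-prediction logit on five financial ratios,
# with separate estimation and validation samples. Data are random placeholders.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
cols = ["equity_to_liabilities", "equity_to_assets", "current_liab_to_assets",
        "borrowings_to_assets", "inventory_turnover"]
X = pd.DataFrame(rng.normal(size=(168, 5)), columns=cols)
y = rng.integers(0, 2, 168)                    # 1 = financially distressed

est = X.index < 120                            # estimation vs. validation split
model = LogisticRegression(max_iter=1000).fit(X[est], y[est])
print("validation accuracy:", accuracy_score(y[~est], model.predict(X[~est])))
print(dict(zip(cols, model.coef_[0].round(3))))
```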


2021
Vol 2021
pp. 1-9
Author(s):
Humera Batool
Lixin Tian

Infectious diseases like COVID-19 spread rapidly and have led to substantial economic loss worldwide, including in Pakistan. The effect of weather on the spread of COVID-19 needs more detailed examination, as some studies have claimed that weather can mitigate its spread. COVID-19 was declared a pandemic by the WHO and has been reported in about 210 countries and territories worldwide, across regions including Asia, Europe, the USA, and North America. Person-to-person contact and international air travel between nations were the leading causes of the spread of SARS-CoV-2 from its point of origin, besides natural forces. However, further spread and infection within a community or country can be aided by natural elements, such as the weather. Therefore, the correlation between COVID-19 and temperature can be better elucidated in countries like Pakistan, where SARS-CoV-2 has affected at least 0.37 million people. This study collected Pakistan's COVID-19 infection and mortality data for ten months (March–December 2020). Related weather parameters, temperature and humidity, were also obtained for the same period. The collected data were processed and used to compare the performance of various time series prediction models in terms of mean squared error (MSE), root-mean-squared error (RMSE), and mean absolute percentage error (MAPE). Using these time series models, this paper estimates the effect of humidity, temperature, and other weather parameters on COVID-19 transmission by computing the correlation between the weather variables and the total numbers of infected cases and deaths in a particular region. The results show that weather parameters hold more influence in explaining the total number of cases and deaths than other factors such as community, age, and total population. Therefore, temperature and humidity are salient parameters for predicting COVID-19 cases. Moreover, it is concluded that the higher the temperature, the lower the mortality due to COVID-19 infection.
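A short sketch of the evaluation metrics (MSE, RMSE, MAPE) and the weather-case correlation referred to above; the series are synthetic placeholders, not the Pakistan data.

```python
# Sketch of the error metrics used to compare time series models (MSE, RMSE,
# MAPE) plus a weather-case correlation, on synthetic placeholder series.
import numpy as np

def mse(y, yhat):  return np.mean((y - yhat) ** 2)
def rmse(y, yhat): return np.sqrt(mse(y, yhat))
def mape(y, yhat): return np.mean(np.abs((y - yhat) / y)) * 100

rng = np.random.default_rng(0)
temperature = 20 + 10 * np.sin(np.linspace(0, 3, 300)) + rng.normal(0, 1, 300)
cases = 800 - 15 * temperature + rng.normal(0, 40, 300)   # toy inverse relation

print("corr(temperature, cases):", np.corrcoef(temperature, cases)[0, 1].round(2))

forecast = cases + rng.normal(0, 60, 300)                  # stand-in model output
print("MSE:", mse(cases, forecast).round(1),
      "RMSE:", rmse(cases, forecast).round(1),
      "MAPE:", mape(cases, forecast).round(2), "%")
```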


2018
Vol 121 (3)
pp. 285-290
Author(s):
Jami L. Josefson
Michael Nodzenski
Octavious Talbot
Denise M. Scholtens
Patrick Catalano

Abstract Newborn adiposity, a nutritional measure of the maternal–fetal intra-uterine environment, is representative of future metabolic health. An anthropometric model using weight, length and flank skinfold to estimate neonatal fat mass has been used in numerous epidemiological studies. Air displacement plethysmography (ADP), a non-invasive technology for measuring body composition, is impractical for large epidemiological studies. The study objective was to determine the consistency of the original anthropometric fat mass estimation equation with ADP. Full-term neonates were studied at 12–72 h of life with weight, length, head circumference, flank skinfold thickness and ADP measurements. Statistical analyses evaluated three models for predicting neonatal fat mass. Lin's concordance correlation coefficient, mean prediction error and root mean squared error between the predicted and observed ADP fat mass values were used to evaluate the models, with ADP considered the gold standard. A multi-ethnic cohort of 468 neonates was studied. Models (M) for predicting fat mass were developed using 349 neonates from site 1 and then independently evaluated in 119 neonates from site 2. M0 was the original anthropometric model, M1 used the same variables as M0 but with updated parameters, and M2 additionally included head circumference. In the independent validation cohort, Lin's concordance correlation estimates demonstrated reasonable accuracy (model 0: 0·843, 1: 0·732, 2: 0·747). Mean prediction error and root mean squared error in the independent validation were much smaller for M0 than for M1 and M2. The original anthropometric model for estimating neonatal fat mass is reasonable for predicting ADP fat mass, thus we advocate its continued use in epidemiological studies.
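Lin's concordance correlation coefficient used to evaluate the models can be computed directly from paired measurements; the values below are hypothetical, with ADP treated as the gold standard.

```python
# Sketch of Lin's concordance correlation coefficient between predicted fat
# mass and the ADP measurement (gold standard). Paired values are made-up.
import numpy as np

def lins_ccc(x, y):
    """Concordance between paired measurements x (predicted) and y (observed)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

predicted = [0.42, 0.35, 0.51, 0.29, 0.47, 0.38]   # kg, hypothetical
observed  = [0.40, 0.33, 0.55, 0.30, 0.45, 0.41]   # kg, ADP "gold standard"
print(round(lins_ccc(predicted, observed), 3))
```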


Author(s):
Mohammad Hossein Ahmadi
Alireza Baghban
Ely Salwana
Milad Sadeghzadeh
Mohammad Zamen
...

Solar energy is a renewable energy resource that is broadly utilized and has the lowest pollution impact among the available alternatives to fossil fuels. In this investigation, the machine learning approaches of neural networks (NN), neuro-fuzzy modelling and the least squares support vector machine (LSSVM) are used to build models for predicting the thermal performance of a photovoltaic-thermal solar collector (PV/T), estimating its efficiency as the model output, while inlet temperature, flow rate, heat, solar radiation, and heat of sun are the model inputs. Experimental measurements were prepared by designing a solar collector system, from which 100 data points were extracted. Different analyses were also performed to examine the credibility of the introduced approaches, revealing great performance. The suggested LSSVM model showed the best performance, with a mean squared error (MSE) of 0.003 and a correlation coefficient (R2) value of 0.99.
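A minimal LSSVM regression sketch with an RBF kernel, of the general kind used above to map collector inputs to efficiency; the data, gamma and sigma values are illustrative.

```python
# Minimal sketch of least squares support vector machine (LSSVM) regression
# with an RBF kernel: solve the LSSVM linear system for the bias b and dual
# coefficients alpha, then predict with the kernel expansion. Data are synthetic.
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    n = len(y)
    K = rbf(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma           # regularized kernel block
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    b, alpha = sol[0], sol[1:]
    return lambda Xq: rbf(Xq, X, sigma) @ alpha + b

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (100, 4))                 # stand-ins for inlet temp, flow, etc.
y = np.sin(X @ np.array([3, 2, 1, 0.5])) + rng.normal(0, 0.05, 100)

predict = lssvm_fit(X[:80], y[:80])
resid = y[80:] - predict(X[80:])
mse = np.mean(resid ** 2)
r2 = 1 - mse / np.var(y[80:])
print(f"MSE={mse:.4f}, R2={r2:.3f}")
```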

