Analysis of Testing‐Based Forward Model Selection

Econometrica ◽  
2020 ◽  
Vol 88 (5) ◽  
pp. 2147-2173 ◽  
Author(s):  
Damian Kozbur

This paper analyzes a procedure called Testing‐Based Forward Model Selection (TBFMS) in linear regression problems. This procedure inductively selects covariates that add predictive power into a working statistical model before estimating a final regression. The criterion for deciding which covariate to include next and when to stop including covariates is derived from a profile of traditional statistical hypothesis tests. This paper proves probabilistic bounds, which depend on the quality of the tests, for prediction error and the number of selected covariates. As an example, the bounds are then specialized to a case with heteroscedastic data, with tests constructed with the help of Huber–Eicker–White standard errors. Under the assumed regularity conditions, these tests lead to estimation convergence rates matching other common high‐dimensional estimators including Lasso.
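The selection loop described in this abstract can be illustrated with a minimal sketch. It is not the paper's exact algorithm. It assumes centered data with no intercept, a heteroskedasticity-robust (HC0) t-statistic as the test, selection of the candidate with the largest statistic, and a fixed cutoff `threshold`. The names `robust_t`, `forward_select`, and `simulate`, and the synthetic design, are hypothetical. Selected covariates are partialled out of the outcome and of the remaining candidates at each step, so each simple regression reproduces the coefficient and HC0 standard error of the newly added covariate in the enlarged model.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def robust_t(x, y):
    """|t| for the slope of y on x (no intercept) using an HC0 variance."""
    sxx = dot(x, x)
    if sxx < 1e-12:  # candidate is (numerically) spanned by selected covariates
        return 0.0
    beta = dot(x, y) / sxx
    resid = [yi - beta * xi for xi, yi in zip(x, y)]
    var = sum((xi * ei) ** 2 for xi, ei in zip(x, resid)) / sxx ** 2
    return abs(beta) / math.sqrt(var) if var > 0 else math.inf


def forward_select(columns, y, threshold):
    """Greedy forward selection; stops when no candidate passes the test.

    columns: list of covariate columns (each a list of n floats).
    threshold: assumed fixed cutoff for the robust |t| statistic.
    """
    y_res = list(y)
    x_res = {j: list(c) for j, c in enumerate(columns)}
    selected = []
    while x_res:
        stats = {j: robust_t(x, y_res) for j, x in x_res.items()}
        best = max(stats, key=stats.get)
        if stats[best] <= threshold:
            break  # stopping rule: no remaining covariate tests significant
        pivot = x_res.pop(best)
        selected.append(best)
        norm = dot(pivot, pivot)
        # Partial the chosen covariate out of y and of every remaining candidate.
        cy = dot(pivot, y_res) / norm
        y_res = [yi - cy * pi for yi, pi in zip(y_res, pivot)]
        for j, x in x_res.items():
            cx = dot(pivot, x) / norm
            x_res[j] = [xi - cx * pi for xi, pi in zip(x, pivot)]
    return selected


def simulate(n=200, p=6, seed=0):
    """Synthetic sparse design (illustration only): y depends on columns 0 and 3."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    y = [3 * columns[0][i] - 2 * columns[3][i]
         + (0.5 + abs(columns[0][i])) * rng.gauss(0, 1)  # heteroskedastic noise
         for i in range(n)]
    return columns, y
```

Calling `forward_select(*simulate(), threshold=5.0)` runs the loop on the synthetic data. A practical cutoff would normally grow with the number of candidate covariates; the constant used here is only a placeholder.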


2017 ◽  
Vol 107 (5) ◽  
pp. 266-269 ◽  
Author(s):  
Damian Kozbur

This paper defines and studies a variable selection procedure called Testing-Based Forward Model Selection. The procedure inductively selects covariates that increase predictive accuracy into a working statistical regression model until a stopping criterion is met. The stopping and selection criteria are defined using statistical hypothesis tests. The paper explicitly describes a testing procedure in the context of high-dimensional linear regression with heteroskedastic disturbances. Finally, a simulation study examines finite sample performance of the proposed procedure and shows that it behaves favorably in high-dimensional sparse settings in terms of prediction error and size of selected model.



2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
Marcelo Lopes ◽  
Angelo Karaboyas ◽  
Kazuhiko Tsuruya ◽  
Issa Al Salmi ◽  
Nidhi Sukul ◽  
...  

Abstract Background and Aims Chronic kidney disease-associated pruritus (CKD-aP) has been linked with comorbid conditions, and poorer mental and physical health-related quality-of-life (HR-QOL) in hemodialysis (HD) patients. The Skindex-10 questionnaire and a single itch-related question from the KDQOL-36 have been used to evaluate the impact of pruritus in HD patients. In this analysis, we investigated the performance of the single question and the Skindex-10 as predictors of HR-QOL in HD patients. Method We analyzed data from 4940 HD patients from 17 countries enrolled during year 2 of phase 5 of the Dialysis Outcomes and Practice Patterns Study (DOPPS, 2013): Belgium, Canada, Germany, the Gulf Cooperation Council (GCC) (Bahrain, Kuwait, Oman, Qatar, Saudi Arabia, United Arab Emirates), Italy, Japan, Russia, Spain, Sweden, Turkey, the UK, and the US. The Skindex-10 scores were calculated as per Mathur et al. (2010): responses to each of the 10 questions (0-6 scale), pertaining to how often patients were bothered by itchy skin in the past week, were summed to create a total summary score (range 0-60, with 0 indicating not at all bothered) and 3 subdomain scores [i.e., itching (disease) and its impact on mood/emotional and social functioning]. The itch-related single question from the KDQOL-36 asked: “During the past 4 weeks, to what extent were you bothered by itchy skin?” with response options including “not at all, somewhat, moderately, very much, extremely”. Itch-related measures were collected concurrently with HR-QOL measures: Physical (PCS) and Mental (MCS) Component Summary scores, derived from the SF-12. We calculated the Spearman correlation coefficient between the Skindex-10 (total score and for each of its 3 domains) and the single question. We used separate linear regression models to evaluate the predictive power of 1) the Skindex-10 score, 2) the single itch question, and 3) both, on PCS and MCS outcomes, based on R-squared values. 
Results Skindex-10 scores varied across countries; the proportion of patients with a very high Skindex-10 score (≥50) ranged from 12% in the GCC to only 2% in Italy, Russia and Sweden. Across all countries, 55% had a Skindex-10 score=0. For the single pruritus question, 37% answered that they were not at all bothered while 16% were very much or extremely bothered by itchy skin. The correlation between the single question and Skindex-10 was 0.71 overall, 0.72 for the disease domain, 0.62 for the social domain, and 0.70 for the emotional domain. Patient characteristics were similar across categories of both pruritus measures. Regression analyses showed that every 10 points higher in the Skindex-10 score was associated with 1.2 point lower PCS (95% CI: -1.4, -0.9) and 1.5 point lower MCS (95% CI: -1.7, -1.3) scores. Similarly, the single question showed increasingly poorer PCS and MCS scores with a greater degree of being bothered by pruritus: compared with patients not at all bothered by itchy skin, patients who were moderately bothered had 4.8 point lower PCS (-5.7, -3.9) and 4.3 point lower MCS (-5.3, -3.3) scores. The R-squared for PCS was 0.065 when using the single question and only 0.033 when using the Skindex-10 as the predictor. R-squared was also higher for MCS when using the single question (0.056) vs. Skindex-10 (0.052). When including both pruritus measures, the predictive power for PCS did not improve compared to the single question (R²=0.065), while increasing only slightly (R²=0.063) for MCS. Conclusion The single KDQOL-36 question about the extent bothered by itchy skin over the past 4 weeks was highly correlated with the Skindex-10 score and at least as predictive – if not more – of key HR-QOL measures as the Skindex-10.
In daily clinical practice, utilizing 1 simple question about the extent patients are bothered by itchy skin can be a feasible and efficient way for routine assessment of pruritus to better identify HD patients with not only CKD-aP but also poorer HR-QOL.
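The two statistics used in this abstract, the Spearman correlation and the R-squared of separate regression models, can be sketched as follows. This is an illustration under assumptions, not the study's analysis code. It treats the single itch question as a categorical predictor, consistent with the reported comparison against patients "not at all" bothered, and the summed score as a continuous predictor. All function names are hypothetical and no study data are used.

```python
from collections import defaultdict


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u)
    sv = sum((b - mv) ** 2 for b in v)
    return cov / (su * sv) ** 0.5


def spearman(u, v):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(u), average_ranks(v))


def r2_continuous(x, y):
    """R-squared of a simple linear regression of y on x (with intercept)."""
    return pearson(x, y) ** 2


def r2_categorical(groups, y):
    """R-squared of regressing y on one indicator per category."""
    mean_y = sum(y) / len(y)
    buckets = defaultdict(list)
    for g, yi in zip(groups, y):
        buckets[g].append(yi)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_between = sum(len(b) * (sum(b) / len(b) - mean_y) ** 2
                     for b in buckets.values())
    return ss_between / ss_total
```

Comparing `r2_categorical(single_question, pcs)` with `r2_continuous(skindex_total, pcs)` on the same patients mirrors the comparison in the abstract. The variable names in that call are placeholders.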



2016 ◽  
Vol 5 (3) ◽  
pp. 61-78
Author(s):  
Magdalena Petrovska ◽  
Aneta Krstevska ◽  
Nikola Naumovski

Abstract This paper aims at assessing the usefulness of leading indicators in business cycle research and forecasting. Initially we test the predictive power of the economic sentiment indicator (ESI) within a static probit model as a leading indicator, commonly perceived to be able to provide a reliable summary of the current economic conditions. We then analyze how well an extended set of indicators performs in forecasting turning points of the Macedonian business cycle by employing the Qual VAR approach of Dueker (2005). Next, we evaluate the quality of the selected indicators in a pseudo-out-of-sample context. The results show that the use of survey-based indicators as a complement to macroeconomic data works satisfactorily in capturing business cycle developments in Macedonia.





2007 ◽  
Vol 16 (06) ◽  
pp. 1093-1113 ◽  
Author(s):  
N. S. THOMAIDIS ◽  
V. S. TZASTOUDIS ◽  
G. D. DOUNIAS

This paper compares a number of neural network model selection approaches on the basis of pricing S&P 500 stock index options. For the choice of the optimal architecture of the neural network, we experiment with a “top-down” pruning technique as well as two “bottom-up” strategies that start with simple models and gradually complicate the architecture if data indicate so. We adopt methods that base model selection on statistical hypothesis testing and information criteria and we compare their performance to a simple heuristic pruning technique. In the first set of experiments, neural network models are employed to fit the entire options surface and in the second they are used as parts of a hybrid intelligence scheme that combines a neural network model with theoretical option-pricing hints.



2007 ◽  
Vol 56 (6) ◽  
pp. 95-103 ◽  
Author(s):  
I. Nopens ◽  
N. Nere ◽  
P.A. Vanrolleghem ◽  
D. Ramkrishna

Many systems contain populations of individuals. Often, they are regarded as a lumped phase, which might, for some applications, lead to inadequate model predictive power. An alternative framework, Population Balance Models, has been used here to describe such a system: activated sludge flocculation, in which particle size is the property one wants to model. An important problem to solve in population balance modelling is to determine the model structure that adequately describes experimentally obtained data on, for instance, the time evolution of the floc size distribution. In this contribution, an alternative method based on solving the inverse problem is used to recover the model structure from the data. In this respect, the presence of similarity in the data simplifies the problem significantly. Similarity was found and the inverse problem could be solved. A forward simulation then confirmed the quality of the model structure to describe the experimental data.



2019 ◽  
Vol 7 (7) ◽  
pp. 232596711985444 ◽  
Author(s):  
Philipp Niemeyer ◽  
Volker Laute ◽  
Wolfgang Zinser ◽  
Christoph Becher ◽  
Thomas Kolombe ◽  
...  

Background: Autologous chondrocyte implantation (ACI) and microfracture are established treatments for large, full-thickness cartilage defects, but there is still a need to expand the clinical and health economic knowledge of these procedures. Purpose: To confirm the noninferiority of ACI compared with microfracture. Study Design: Randomized controlled trial; Level of evidence, 2. Methods: Patients were randomized to be treated with matrix-associated ACI using spheroid technology (n = 52) or microfracture (n = 50). Both procedures followed standard methods. Patients were assessed by the Knee injury and Osteoarthritis Outcome Score (KOOS), MOCART (magnetic resonance observation of cartilage repair tissue) scoring system, Bern score, modified Lysholm score, International Cartilage Repair Society (ICRS) rating (histological and immunochemical scoring after rebiopsy 24 months after implantation), and International Knee Documentation Committee (IKDC) examination form. The main assessments were conducted 24 months after study treatment. Results: In the primary intention-to-treat analysis, the overall KOOS score for both ACI and microfracture yielded a statistically significant improvement relative to baseline. According to the between-group analysis, ACI passed the test of noninferiority compared with microfracture; thus, the primary goal of the study was achieved. The KOOS subscores yielded the same qualitative results as the overall KOOS score (ie, for each of these, noninferiority was demonstrated), and in 1 case (Activities of Daily Living subscore), the threshold for superiority was passed. The subgroup analyses did not yield any clear evidence of an association between treatment effect and any of the categories investigated (age, diagnosis, defect localization, sex).
A histological analysis of biopsies from 16 patients (ACI: n = 9; microfracture: n = 7) suggested a better quality of repair in the patients treated with ACI. Conclusion: The efficacy of both ACI and microfracture was demonstrated with respect to both functional outcomes and morphological repair. The primary analysis confirmed the statistical hypothesis of the noninferiority of ACI, even for relatively small cartilage defects (1-4 cm²) treated in this study, the indication for which microfracture is generally accepted as the standard of care. ACI showed significant superiority in the KOOS subscores of Activities of Daily Living at 24 months and Knee-related Quality of Life at 12 months. Registration: NCT01222559 (ClinicalTrials.gov identifier).



2018 ◽  
Vol 55 (4) ◽  
pp. 1001-1013
Author(s):  
Catherine Aaron ◽  
Olivier Bodart

Abstract Consider a sample $\mathcal{X}_n=\{X_1,\dots,X_n\}$ of independent and identically distributed variables drawn with a probability distribution $\mathbb{P}_X$ supported on a compact set $M\subset\mathbb{R}^d$. In this paper we mainly deal with the study of a natural estimator for the geodesic distance on $M$. Under rather general geometric assumptions on $M$, we prove a general convergence result. Assuming $M$ to be a compact manifold of known dimension $d'\le d$, and under regularity assumptions on $\mathbb{P}_X$, we give an explicit convergence rate. In the case when $M$ has no boundary, knowledge of the dimension $d'$ is not needed to obtain this convergence rate. The second part of the work consists in building an estimator for the Fréchet expectations on $M$, and proving its convergence under regularity conditions, applying the previous results.
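The abstract does not spell out the estimator. One common way to estimate geodesic distance from a sample is the shortest-path length in a neighbourhood graph, and the sketch below assumes that construction. The function name `graph_geodesic` and the connection radius `radius` are hypothetical choices for illustration, and the paper's estimator may differ in its details.

```python
import heapq
import math


def graph_geodesic(points, radius, source, target):
    """Shortest-path distance between two sample points.

    Two points are joined by an edge, weighted by their Euclidean distance,
    whenever that distance is at most `radius` (an assumed tuning parameter).
    Returns math.inf if the target cannot be reached.
    """
    n = len(points)
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist[u]:
            continue  # stale heap entry
        for v in range(n):
            if v == u:
                continue
            w = math.dist(points[u], points[v])
            if w <= radius and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return math.inf
```

For points sampled on a unit circle, the graph distance between two antipodal sample points approximates the arc length rather than the straight-line distance of 2, provided the radius is small enough to exclude shortcuts across the circle and large enough to keep the graph connected.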



2018 ◽  
Vol 210 ◽  
pp. 02016 ◽  
Author(s):  
Tomasz Rymarczyk ◽  
Grzegorz Kłosowski

The article presents four selected methods of supervised machine learning, which can be successfully used in the tomography of flood embankments, walls, tanks, reactors and pipes. A comparison of the following methods was made: Artificial Neural Networks (ANN), Support Vector Machine (SVM), K-Nearest Neighbour (KNN) and Multivariate Adaptive Regression Splines (MAR Splines). All analysed methods concerned regression problems. The analysis quantified the differences between the methods and visualized them using indicators such as regression and mean squared error. Moreover, an innovative method of denoising tomographic output images with the use of convolutional auto-encoders was presented. Thanks to the use of a convolutional structure composed of two auto-encoders, a significant improvement in the quality of the output image from the ECT tomography was achieved.




