scholarly journals Model Selection in Continuous Test Norming With GAMLSS

Assessment ◽  
2017 ◽  
Vol 26 (7) ◽  
pp. 1329-1346 ◽  
Author(s):  
Lieke Voncken ◽  
Casper J. Albers ◽  
Marieke E. Timmerman

To compute norms from reference group test scores, continuous norming is preferred over traditional norming. A suitable continuous norming approach for continuous data is the use of the Box–Cox Power Exponential model, which is found in the generalized additive models for location, scale, and shape. Applying the Box–Cox Power Exponential model for test norming requires model selection, but it is unknown how well this can be done with an automatic selection procedure. In a simulation study, we compared the performance of two stepwise model selection procedures combined with four model-fit criteria (Akaike information criterion, Bayesian information criterion, generalized Akaike information criterion (3), cross-validation), varying data complexity, sampling design, and sample size in a fully crossed design. The new procedure combined with one of the generalized Akaike information criterion was the most efficient model selection procedure (i.e., required the smallest sample size). The advocated model selection procedure is illustrated with norming data of an intelligence test.

2018 ◽  
Vol 43 (4) ◽  
pp. 272-289 ◽  
Author(s):  
Sedat Sen ◽  
Allan S. Cohen ◽  
Seock-Ho Kim

Mixture item response theory (MixIRT) models can sometimes be used to model the heterogeneity among the individuals from different subpopulations, but these models do not account for the multilevel structure that is common in educational and psychological data. Multilevel extensions of the MixIRT models have been proposed to address this shortcoming. Successful applications of multilevel MixIRT models depend in part on detection of the best fitting model. In this study, performance of information indices, Akaike information criterion (AIC), Bayesian information criterion (BIC), consistent Akaike information criterion (CAIC), and sample-size adjusted Bayesian information criterion (SABIC), were compared for use in model selection with a two-level mixture Rasch model in the context of a real data example and a simulation study. Level 1 consisted of students and Level 2 consisted of schools. The performances of the model selection criteria under different sample sizes were investigated in a simulation study. Total sample size (number of students) and Level 2 sample size (number of schools) were studied for calculation of information criterion indices to examine the performance of these fit indices. Simulation study results indicated that CAIC and BIC performed better than the other indices at detection of the true (i.e., generating) model. Furthermore, information indices based on total sample size yielded more accurate detections than indices at Level 2.


Author(s):  
Mark David Walker ◽  
Mihály Sulyok

Abstract Background Restrictions on social interaction and movement were implemented by the German government in March 2020 to reduce the transmission of coronavirus disease 2019 (COVID-19). Apple's “Mobility Trends” (AMT) data details levels of community mobility; it is a novel resource of potential use to epidemiologists. Objective The aim of the study is to use AMT data to examine the relationship between mobility and COVID-19 case occurrence for Germany. Is a change in mobility apparent following COVID-19 and the implementation of social restrictions? Is there a relationship between mobility and COVID-19 occurrence in Germany? Methods AMT data illustrates mobility levels throughout the epidemic, allowing the relationship between mobility and disease to be examined. Generalized additive models (GAMs) were established for Germany, with mobility categories, and date, as explanatory variables, and case numbers as response. Results Clear reductions in mobility occurred following the implementation of movement restrictions. There was a negative correlation between mobility and confirmed case numbers. GAM using all three categories of mobility data accounted for case occurrence as well and was favorable (AIC or Akaike Information Criterion: 2504) to models using categories separately (AIC with “driving,” 2511. “transit,” 2513. “walking,” 2508). Conclusion These results suggest an association between mobility and case occurrence. Further examination of the relationship between movement restrictions and COVID-19 transmission may be pertinent. The study shows how new sources of online data can be used to investigate problems in epidemiology.


2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 1252.2-1253
Author(s):  
R. Garofoli ◽  
M. Resche-Rigon ◽  
M. Dougados ◽  
D. Van der Heijde ◽  
C. Roux ◽  
...  

Background:Axial spondyloarthritis (axSpA) is a chronic rheumatic disease that encompasses various clinical presentations: inflammatory chronic back pain, peripheral manifestations and extra-articular manifestations. The current nomenclature divides axSpA in radiographic (in the presence of radiographic sacroiliitis) and non-radiographic (in the absence of radiographic sacroiliitis, with or without MRI sacroiliitis. Given that the functional burden of the disease appears to be greater in patients with radiographic forms, it seems crucial to be able to predict which patients will be more likely to develop structural damage over time. Predictive factors for radiographic progression in axSpA have been identified through use of traditional statistical models like logistic regression. However, these models present some limitations. In order to overcome these limitations and to improve the predictive performance, machine learning (ML) methods have been developed.Objectives:To compare ML models to traditional models to predict radiographic progression in patients with early axSpA.Methods:Study design: prospective French multicentric cohort study (DESIR cohort) with 5years of follow-up. Patients: all patients included in the cohort, i.e. 708 patients with inflammatory back pain for >3 months but <3 years, highly suggestive of axSpA. Data on the first 5 years of follow-up was used. Statistical analyses: radiographic progression was defined as progression either at the spine (increase of at least 1 point per 2 years of mSASSS scores) or at the sacroiliac joint (worsening of at least one grade of the mNY score between 2 visits). Traditional modelling: we first performed a bivariate analysis between our outcome (radiographic progression) and explanatory variables at baseline to select the variables to be included in our models and then built a logistic regression model (M1). Variable selection for traditional models was performed with 2 different methods: stepwise selection based on Akaike Information Criterion (stepAIC) method (M2), and the Least Absolute Shrinkage and Selection Operator (LASSO) method (M3). We also performed sensitivity analysis on all patients with manual backward method (M4) after multiple imputation of missing data. Machine learning modelling: using the “SuperLearner” package on R, we modelled radiographic progression with stepAIC, LASSO, random forest, Discrete Bayesian Additive Regression Trees Samplers (DBARTS), Generalized Additive Models (GAM), multivariate adaptive polynomial spline regression (polymars), Recursive Partitioning And Regression Trees (RPART) and Super Learner. Finally, the accuracy of traditional and ML models was compared based on their 10-foldcross-validated AUC (cv-AUC).Results:10-fold cv-AUC for traditional models were 0.79 and 0.78 for M2 and M3, respectively. The 3 best models in the ML algorithm were the GAM, the DBARTS and the Super Learner models, with 10-fold cv-AUC of: 0.77, 0.76 and 0.74, respectively (Table 1).Table 1.Comparison of 10-fold cross-validated AUC between best traditional and machine learning models.Best modelsCross-validated AUCTraditional models M2 (step AIC method)0.79 M3 (LASSO method)0.78Machine learning approach SL Discrete Bayesian Additive Regression Trees Samplers (DBARTS)0.76 SL Generalized Additive Models (GAM)0.77 Super Learner0.74AUC: Area Under the Curve; AIC: Akaike Information Criterion; LASSO: Least Absolute Shrinkage and Selection Operator; SL: SuperLearner. N = 295.Conclusion:Traditional models predicted better radiographic progression than ML models in this early axSpA population. Further ML algorithms image-based or with other artificial intelligence methods (e.g. deep learning) might perform better than traditional models in this setting.Acknowledgments:Thanks to the French National Society of Rheumatology and the DESIR cohort.Disclosure of Interests:Romain Garofoli: None declared, Matthieu resche-rigon: None declared, Maxime Dougados Grant/research support from: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Consultant of: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Speakers bureau: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Désirée van der Heijde Consultant of: AbbVie, Amgen, Astellas, AstraZeneca, BMS, Boehringer Ingelheim, Celgene, Cyxone, Daiichi, Eisai, Eli-Lilly, Galapagos, Gilead Sciences, Inc., Glaxo-Smith-Kline, Janssen, Merck, Novartis, Pfizer, Regeneron, Roche, Sanofi, Takeda, UCB Pharma; Director of Imaging Rheumatology BV, Christian Roux: None declared, Anna Moltó Grant/research support from: Pfizer, UCB, Consultant of: Abbvie, BMS, MSD, Novartis, Pfizer, UCB


2014 ◽  
Vol 2014 ◽  
pp. 1-13
Author(s):  
Qichang Xie ◽  
Meng Du

The essential task of risk investment is to select an optimal tracking portfolio among various portfolios. Statistically, this process can be achieved by choosing an optimal restricted linear model. This paper develops a statistical procedure to do this, based on selecting appropriate weights for averaging approximately restricted models. The method of weighted average least squares is adopted to estimate the approximately restricted models under dependent error setting. The optimal weights are selected by minimizing ak-class generalized information criterion (k-GIC), which is an estimate of the average squared error from the model average fit. This model selection procedure is shown to be asymptotically optimal in the sense of obtaining the lowest possible average squared error. Monte Carlo simulations illustrate that the suggested method has comparable efficiency to some alternative model selection techniques.


Economies ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 49 ◽  
Author(s):  
Waqar Badshah ◽  
Mehmet Bulut

Only unstructured single-path model selection techniques, i.e., Information Criteria, are used by Bounds test of cointegration for model selection. The aim of this paper was twofold; one was to evaluate the performance of these five routinely used information criteria {Akaike Information Criterion (AIC), Akaike Information Criterion Corrected (AICC), Schwarz/Bayesian Information Criterion (SIC/BIC), Schwarz/Bayesian Information Criterion Corrected (SICC/BICC), and Hannan and Quinn Information Criterion (HQC)} and three structured approaches (Forward Selection, Backward Elimination, and Stepwise) by assessing their size and power properties at different sample sizes based on Monte Carlo simulations, and second was the assessment of the same based on real economic data. The second aim was achieved by the evaluation of the long-run relationship between three pairs of macroeconomic variables, i.e., Energy Consumption and GDP, Oil Price and GDP, and Broad Money and GDP for BRICS (Brazil, Russia, India, China and South Africa) countries using Bounds cointegration test. It was found that information criteria and structured procedures have the same powers for a sample size of 50 or greater. However, BICC and Stepwise are better at small sample sizes. In the light of simulation and real data results, a modified Bounds test with Stepwise model selection procedure may be used as it is strongly theoretically supported and avoids noise in the model selection process.


2019 ◽  
Vol 15 (2) ◽  
Author(s):  
Severin Guy Mahiane ◽  
Carel Pretorius ◽  
Eline Korenromp

Abstract This paper presents two approaches to smoothing time trends in prevalence and estimating the underlying incidence of remissible infections. In the first approach, we use second order segmented polynomials to smooth a curve in a bounded domain. In the second, incidence is modeled instead and the prevalence is reconstructed using the recovery rate which is assumed to be known. In both approaches, the number of knots and their positions are estimated, resulting in non-linear regressions. Akaike Information Criterion is used for model selection. The method is illustrated with Syphilis and Gonorrhea prevalence smoothing and incidence trend estimation in Guinea-Bissau and South Africa, respectively.


Sign in / Sign up

Export Citation Format

Share Document