maat: An R Package for Multiple Administrations Adaptive Testing

2021 ◽  
pp. 014662162110492
Author(s):  
Seung W. Choi ◽  
Sangdon Lim ◽  
Luping Niu ◽  
Sooyong Lee ◽  
Christina M. Schneider ◽  
...  

Multiple Administrations Adaptive Testing (MAAT) is an extension of the shadow-test approach to computerized adaptive testing (CAT) for assessment frameworks in which multiple tests are administered periodically throughout the year. The maat package uses multiple item pools vertically scaled across grades and multiple phases (stages) within each test administration, allowing a transition from one item pool to another whenever doing so would further enhance the quality of assessment.

1982 ◽  
Vol 6 (4) ◽  
pp. 473-492 ◽  
Author(s):  
David J. Weiss

Approaches to adaptive (tailored) testing based on item response theory are described and research results summarized. Through appropriate combinations of item pool design and use of different test termination criteria, adaptive tests can be designed (1) to improve both measurement quality and measurement efficiency, resulting in measurements of equal precision at all trait levels; (2) to improve measurement efficiency for test batteries using item pools designed for conventional test administration; and (3) to improve the accuracy and efficiency of testing for classification (e.g., mastery testing). Research results show that tests based on item response theory (IRT) can achieve measurements of equal precision at all trait levels, given an adequately designed item pool; these results contrast with those of conventional tests, which require a tradeoff of bandwidth for fidelity/precision of measurements. Data also show reductions in bias, inaccuracy, and root mean square error of ability estimates. Improvements in test fidelity observed in simulation studies are supported by live-testing data, which showed adaptive tests requiring half as many items as conventional tests to achieve equal levels of reliability, and almost one-third as many to achieve equal levels of validity. When used with item pools from conventional tests, both simulation and live-testing results show reductions in test battery length relative to conventional tests, with no reductions in the quality of measurements. Adaptive tests designed for dichotomous classification also represent improvements over conventional tests designed for the same purpose. Simulation studies show reductions in test length and improvements in classification accuracy for adaptive vs. conventional tests; live-testing studies in which adaptive tests were compared with "optimal" conventional tests support these findings. Thus, the research data show that IRT-based adaptive testing takes advantage of the capabilities of IRT to improve the quality and/or efficiency of measurement for each examinee.


2019 ◽  
Vol 79 (6) ◽  
pp. 1133-1155
Author(s):  
Emre Gönülateş

This article introduces the Quality of Item Pool (QIP) Index, a novel approach to quantifying the adequacy of an item pool of a computerized adaptive test for a given set of test specifications and examinee population. This index ranges from 0 to 1, with values close to 1 indicating the item pool presents optimum items to examinees throughout the test. This index can be used to compare different item pools or diagnose the deficiencies of a given item pool by quantifying the amount of deviation from a perfect item pool. Simulation studies were conducted to evaluate the capacity of this index for detecting the inadequacies of two simulated item pools. The value of this index was compared with the existing methods of evaluating the quality of computerized adaptive tests (CAT). Results of the study showed that the QIP Index can detect even slight deviations between a proposed item pool and an optimal item pool. It can also uncover shortcomings of an item pool that other outcomes of CAT cannot detect. CAT developers can use the QIP Index to diagnose the weaknesses of the item pool and as a guide for improving item pools.
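
The abstract does not reproduce the QIP formula, but the underlying idea, quantifying how far the items a pool can actually offer fall short of an ideal, perfectly targeted pool, can be sketched with a hypothetical proxy. Everything below (the function names, the ideal-item definition, the choice of the 2PL model) is an illustrative assumption, not the published index:

```python
import math

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item (discrimination a, difficulty b)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def pool_adequacy(pool, theta_grid):
    """Hypothetical proxy for pool quality: at each ability level, compare the
    best information the pool can offer with that of an ideal item located
    exactly at that ability (b == theta) with the pool's highest discrimination.
    Returns a value in (0, 1]; 1 means the pool matches the ideal everywhere."""
    a_max = max(a for a, _ in pool)
    ratios = []
    for theta in theta_grid:
        best = max(info_2pl(theta, a, b) for a, b in pool)
        ideal = info_2pl(theta, a_max, theta)  # perfectly targeted item
        ratios.append(best / ideal)
    return sum(ratios) / len(ratios)
```

A dense, well-targeted pool scores near 1, while a pool whose difficulties cluster far from most examinees scores much lower, mirroring the index's stated use for diagnosing pool deficiencies.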



2021 ◽  
pp. 073428292110277
Author(s):  
Ioannis Tsaousis ◽  
Georgios D. Sideridis ◽  
Hannan M. AlGhamdi

This study evaluated the psychometric quality of a computerized adaptive testing (CAT) version of the general cognitive ability test (GCAT), using the simulation study protocol put forth by Han (2018a). For the analysis, three different sets of items were generated, providing an item pool of 165 items. Before evaluating the efficiency of the GCAT, all items in the final item pool were linked (equated) following a sequential approach. Ability values (θ) for 10,000 virtual examinees were generated from a standard normal distribution (M = 0, SD = 1), and each examinee's ability was estimated from the measure's 165-item bank. Maximum Fisher information (MFI) and maximum likelihood estimation with fences (MLEF) were used as the item selection and score estimation methods, respectively. For item exposure control, the fade away method (FAM) was preferred. The termination criterion was reaching SE ≤ 0.33. The study revealed that the average number of items administered to the 10,000 participants was 15. Moreover, the precision in estimating participants' ability scores was very high, as demonstrated by the CBIAS, CMAE, and CRMSE values. It is concluded that the CAT version of the test is a promising alternative to administering the corresponding full-length measure, since it reduces the number of administered items, prevents high rates of item exposure, and provides accurate scores with minimum measurement error.
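
As a rough illustration of the simulation setup described above, the following sketch implements MFI item selection with an SE ≤ 0.33 stopping rule under a 2PL model. It substitutes a plain one-step maximum likelihood update for MLEF and omits the FAM exposure control; all function names and the toy pool are assumptions, not the GCAT implementation:

```python
import math
import random

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL IRT model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def run_cat(true_theta, pool, se_target=0.33, max_items=50, seed=1):
    """Administer items by maximum Fisher information (MFI) until the
    standard error of the ability estimate drops to se_target."""
    rng = random.Random(seed)
    theta, used, resp = 0.0, [], []
    se = float("inf")
    while len(used) < max_items:
        # MFI selection: unused item with the most information at current theta
        best = max((i for i in range(len(pool)) if i not in used),
                   key=lambda i: info_2pl(theta, *pool[i]))
        used.append(best)
        a, b = pool[best]
        # simulate the examinee's response from the true ability
        resp.append(1 if rng.random() < p_2pl(true_theta, a, b) else 0)
        # one Newton step toward the maximum likelihood estimate of theta
        grad = sum(pool[i][0] * (u - p_2pl(theta, *pool[i]))
                   for i, u in zip(used, resp))
        info = sum(info_2pl(theta, *pool[i]) for i in used)
        theta = max(-4.0, min(4.0, theta + grad / info))
        se = 1.0 / math.sqrt(info)
        if se <= se_target:
            break
    return theta, se, len(used)
```

With a reasonably informative toy pool, such a loop typically stops well before the pool is exhausted, which is the mechanism behind the short average test length reported above.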


Author(s):  
Silvanys L Rodríguez-Mercedes ◽  
Khushbu F Patel ◽  
Camerin A Rencken ◽  
Gabrielle G Grant ◽  
Kate Surette ◽  
...  

Abstract Introduction The transition from early childhood to the teen years (ages 5–12) is a critical time of development, which can be made particularly challenging by a burn injury. Assessing post-burn recovery during these years is important for improving pediatric survivors' development and health outcomes. Few validated burn-specific measures exist for this age group. The purpose of this study was to generate the item pools that will be used to create a future computerized adaptive test (CAT) assessing post-burn recovery in school-aged children. Methods Item pool development was guided by the previously developed School-Aged Life Impact Burn Recovery Evaluation (SA-LIBRE5-12) Conceptual Framework. The process involved a systematic literature review, extraction of candidate items from existing legacy measures, iterative item review during expert consensus meetings, and parent cognitive interviews. Results The iterative item review with experts consisted of six rounds, and a total of 10 parent cognitive interviews were conducted. The three broad themes of concern were 1) items needing clarification or context, or that were vague; 2) age dependence and relevance; and 3) word choice. The cognitive interviews indicated that the survey instructions, recall period, item stems, and response choices were interpretable by respondents. The final item pools, based on parental feedback, consist of 57, 81, and 60 items in the Physical, Psychological, and Family and Social Functioning domains, respectively. Conclusion The developed item pools (n=198) in three domains are consistent with the existing conceptual framework. The next step involves field-testing the item pools and calibration using item response theory to develop and validate the SA-LIBRE5-12 CAT Profile.


2021 ◽  
Author(s):  
Jason Hunter ◽  
Mark Thyer ◽  
Dmitri Kavetski ◽  
David McInerney

Probabilistic predictions provide crucial information regarding the uncertainty of hydrological predictions, which is a key input for risk-based decision-making. However, they are often excluded from hydrological modelling applications because suitable probabilistic error models can be challenging to construct and interpret, and the quality of results is often reliant on the objective function used to calibrate the hydrological model.

We present an open-source R package and an online web application that achieve two aims. First, these resources are easy to use and accessible, so that users need not have specialised knowledge in probabilistic modelling to apply them. Second, the probabilistic error model we describe provides high-quality probabilistic predictions for a wide range of commonly used hydrological objective functions, made possible by a new innovation that resolves a long-standing issue with model assumptions that previously prevented this broad application.

We demonstrate our methods by comparing the new probabilistic error model with an existing reference error model in an empirical case study spanning 54 perennial Australian catchments, the hydrological model GR4J, 8 common objective functions, and 4 performance metrics (reliability, precision, volumetric bias, and errors in the flow duration curve). The existing reference error model introduces additional flow dependencies into the residual error structure when used with most of the study objective functions, which in turn leads to poor-quality probabilistic predictions. In contrast, the new probabilistic error model achieves high-quality probabilistic predictions for all objective functions in this case study.

The new probabilistic error model, the open-source software, and the web application aim to facilitate the adoption of probabilistic predictions in the hydrological modelling community and to improve the quality of predictions and of the decisions made using them. In particular, our methods can be used to achieve high-quality probabilistic predictions from hydrological models calibrated with a wide range of common objective functions.


Author(s):  
Syed Mustafa Ali ◽  
Farah Naureen ◽  
Arif Noor ◽  
Maged Kamel N. Boulos ◽  
Javariya Aamir ◽  
...  

Background Increasingly, healthcare organizations are using technology for the efficient management of data. The aim of this study was to compare the data quality of digital records with that of the corresponding paper-based records, using a data quality assessment framework. Methodology We conducted a desk review of paper-based and digital records at six enrolled TB clinics over the study period from April 2016 to July 2016. We entered all data fields of the patient treatment (TB01) card into a spreadsheet-based template to undertake a field-to-field comparison of the fields shared between the TB01 card and the digital data. Findings A total of 117 TB01 cards were prepared at the six enrolled sites, of which only 50% (n=59) had been digitized. There were 1,239 comparable data fields, of which 65% (n=803) matched between the paper-based and digital records, while 35% (n=436) showed anomalies in either the paper-based or the digital record. On average there were 1.9 data quality issues per digital patient record versus 2.1 per paper-based record. The analysis of valid data quality issues found more issues in paper-based records (n=123) than in digital records (n=110). Conclusion There were fewer data quality issues in digital records than in the corresponding paper-based records. Greater use of mobile data capture, and continued use of the data quality assessment framework, can deliver more meaningful information for decision making.
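
The field-to-field comparison described in the methodology can be sketched as a simple record diff. The field names below are hypothetical, and real records would need value normalization (dates, spellings) before comparison:

```python
def field_match_report(paper, digital):
    """Field-to-field comparison of the fields shared between a paper-based
    record and its digital copy; returns (compared, matched, anomalies)."""
    shared = sorted(set(paper) & set(digital))
    matched = sum(1 for f in shared if paper[f] == digital[f])
    return len(shared), matched, len(shared) - matched

# Hypothetical TB01-style records (field names are illustrative only):
paper = {"patient_name": "A. Khan", "age": "34", "regimen": "Cat I"}
digital = {"patient_name": "A. Khan", "age": "43", "regimen": "Cat I",
           "phone": "0300-0000000"}
```

Here three fields are comparable, two match, and the age mismatch counts as one anomaly; aggregating such counts over all records yields the per-record issue rates reported in the findings.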


2020 ◽  
Author(s):  
Maxim Ivanov ◽  
Albin Sandelin ◽  
Sebastian Marquardt

Abstract Background: The quality of gene annotation determines the interpretation of results obtained in transcriptomic studies. The growing amount of genome sequence information calls for experimental and computational pipelines for de novo transcriptome annotation. Ideally, gene and transcript models should be called from a limited set of key experimental data. Results: We developed TranscriptomeReconstructoR, an R package which implements a pipeline for automated transcriptome annotation. It relies on integrating features from independent and complementary datasets: i) full-length RNA-seq for detection of splicing patterns and ii) high-throughput 5' and 3' tag sequencing data for accurate definition of gene borders. The pipeline can also take a nascent RNA-seq dataset to supplement the called gene model with transient transcripts. As a proof of principle, we reconstructed de novo the transcriptional landscape of wild-type Arabidopsis thaliana seedlings. A comparison to the existing transcriptome annotations revealed that our gene model is more accurate and comprehensive than the two most commonly used community gene models, TAIR10 and Araport11. In particular, we identify thousands of transient transcripts missing from the existing annotations. Our new annotation promises to improve the quality of A. thaliana genome research. Conclusions: Our proof-of-concept data suggest a cost-efficient strategy for rapid and accurate annotation of complex eukaryotic transcriptomes. We combine the choice of library preparation methods and sequencing platforms with the dedicated computational pipeline implemented in the TranscriptomeReconstructoR package. The pipeline requires prior knowledge only of the reference genomic DNA sequence, not of the transcriptome. The package integrates seamlessly with Bioconductor packages for downstream analysis.


2016 ◽  
Vol 45 (1) ◽  
pp. 55-69 ◽  
Author(s):  
Sebastian Warnholz ◽  
Timo Schmid

The demand for reliable regional estimates from sample surveys has grown substantially over the last decades. Small area estimation provides statistical methods to produce reliable predictions when the sample sizes in regions are too small to apply direct estimators. Model- and design-based simulations are used to gain insights into the quality of the introduced methods. In this article we present a framework which may help to guarantee the reproducibility of simulation studies in articles and during research. The introduced R package saeSim is tailored to provide a simulation environment for the special case of small area estimation. The package allows researchers to produce simulation studies with minimal coding effort.


2019 ◽  
Author(s):  
Céline Monteil ◽  
Fabrice Zaoui ◽  
Nicolas Le Moine ◽  
Frédéric Hendrickx

Abstract. Environmental modelling is complex, and models often require the calibration of several parameters that are not directly evaluable from a physical quantity or a field measurement. The R package caRamel has been designed to easily implement a multi-objective optimizer in the R environment to calibrate these parameters. A multi-objective calibration makes it possible to find a compromise between different goals by defining a set of optimal parameters. The algorithm is a hybrid of the Multiobjective Evolutionary Annealing Simplex method (MEAS) and the Nondominated Sorting Genetic Algorithm II (ε-NSGA-II). The optimizer was initially developed for the calibration of hydrological models but can be used for any environmental model. The main function of the package, caRamel(), requires the definition of a multi-objective calibration function as well as bounds on the variation of the parameters to optimize. The package is well adapted to complex modelling: for a real hydrological case study with 8 parameters and 3 calibration objectives, caRamel converges quickly, reaching a stable solution after 5,000 model evaluations with robust results. A comparison with another well-known optimizer (MCO, for Multiple Criteria Optimization) confirms the quality of the algorithm.
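
caRamel's hybrid MEAS/ε-NSGA-II algorithm is too involved for a short sketch, but the Pareto-dominance filter at the heart of any nondominated-sorting optimizer can be illustrated compactly (minimization is assumed; the function names are ours, not the package API):

```python
def dominates(u, v):
    """True if objective vector u Pareto-dominates v (minimization):
    u is no worse on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(u, v)) and any(x < y for x, y in zip(u, v))

def pareto_front(points):
    """Return the nondominated subset of a list of objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, `pareto_front([(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)])` keeps only the three trade-off points (1, 5), (2, 2), and (5, 1); an evolutionary optimizer such as ε-NSGA-II repeatedly applies this kind of filter, plus diversity rules, to steer a population of candidate parameter sets toward the front.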

