Commentary on Reardon, Kalogrides, and Ho’s “Validation Methods for Aggregate-Level Test Scale Linking: A Case Study Mapping School District Test Score Distributions to a Common Scale”

2020
Author(s): Daniel Bolt

The studies presented by Reardon, Kalogrides, and Ho provide preliminary support for a National Assessment of Educational Progress–based aggregate linking of state assessments when used for research purposes. In this commentary, I suggest future efforts to explore possible sources of district-level bias, to evaluate predictive accuracy at the state level, and to better understand the performance of the linking when applied to the inevitably nonrepresentative district samples that research studies will encounter.

2019
Author(s): Sean F. Reardon, Demetra Kalogrides, Andrew D. Ho

Linking score scales across different tests is considered speculative and fraught, even at the aggregate level. We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that aggregate linkages can be validated both directly and indirectly under certain conditions, such as when scores for at least some target units (districts) are available on a common test (e.g., the National Assessment of Educational Progress). We introduce precision-adjusted random effects models to estimate linking error, for populations and for subpopulations, for averages and for progress over time. These models allow us to distinguish linking error from sampling variability and illustrate how linking error plays a larger role in aggregates with smaller sample sizes. Assuming that target districts generalize to the full population of districts, we can show that standard errors for district means are generally less than .2 standard deviation units, leading to reliabilities above .7 for roughly 90% of districts. We also show how sources of imprecision and linking error contribute to district comparisons within versus between states. This approach is applicable whenever the essential counterfactual question—"what would means/variance/progress for the aggregate units be, had students taken the other test?"—can be answered directly for at least some of the units.
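The core idea of separating linking error from sampling variability can be illustrated with a simple simulation: each observed linked district mean combines a true mean, a linking-error term, and a sampling-error term whose variance shrinks with district size. This is only a sketch of the variance decomposition, not the authors' model; every parameter value below (linking-error SD, district sizes, score SDs) is an assumption chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative district-level simulation (all parameter values assumed,
# not taken from Reardon, Kalogrides, and Ho).
J = 2000
n = rng.integers(20, 2000, size=J)            # students per district
true_mean = rng.normal(0.0, 0.35, size=J)     # true district means (SD units)
tau = 0.10                                    # linking-error SD (assumed)
sigma = 1.0                                   # within-district student SD

sampling_var = sigma**2 / n                   # shrinks as districts grow
observed = (true_mean
            + rng.normal(0.0, tau, size=J)                # linking error
            + rng.normal(0.0, np.sqrt(sampling_var)))     # sampling error

# Per-district error variance is tau^2 + sigma^2/n_j, so sampling error
# dominates in small districts while linking error dominates in large ones.
error_var = tau**2 + sampling_var
signal_var = observed.var(ddof=1) - error_var.mean()
reliability = signal_var / (signal_var + error_var)

print(f"median total SE: {np.median(np.sqrt(error_var)):.3f}")
print(f"share of districts with reliability > .7: {(reliability > 0.7).mean():.2f}")
```

Because the two error sources add in variance, a method-of-moments split like this recovers a per-district reliability that rises with district size, mirroring the abstract's point that linking error matters relatively more for large aggregates while sampling error dominates small ones.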


2020
Author(s): Alina A. von Davier

In this commentary, I share my perspective on the goals of assessments in general and on linking assessments that were developed according to different specifications and for different purposes, and I propose several considerations for the authors and the readers. This brief commentary is structured around three perspectives: (1) the context of this research, (2) the methodology proposed here, and (3) the consequences for applied research.


2020
Author(s): Tim Moses, Neil J. Dorans

The Reardon, Kalogrides, and Ho article on validation methods for aggregate-level test scale linking is an attempt to validate a district-level scale-aligning procedure that appears to be a new solution to an old problem. Their aligning procedure uses the National Assessment of Educational Progress (NAEP) scale to piece together a patchwork of data structures from different tests of different constructs, obtained under different administration conditions and used in different ways by different states. In this article, we critique their linking and validation efforts. Our critique has three components. First, we review the recommendations for linking state assessments to NAEP from several studies and commentaries to provide background from which to interpret Reardon et al.'s validation attempts. Second, we provide a replication of the Reardon et al. empirical validations of their proposed linking procedure to demonstrate that correlations between district means on two test scores can be high even when (1) the constructs being measured by the tests are different and (2) the district-level means estimated using the Reardon et al. linking approach differ substantially from actual district-level means. Third, we suggest additional checks for construct similarity and subpopulation invariance from other concordance studies that could be used to assess whether the inferences made by Reardon et al. are warranted. Finally, until such checks are made, we urge cautious use of the Reardon et al. results.


2020
Author(s): Mark L. Davison

This paper begins by situating the linking methods of Reardon, Kalogrides, and Ho within the broader literature on linking. Trends in the validity data suggest that there may be conditional bias in the estimates of district means, but the data in the article are not conclusive on this point. Further, the data used in their case study may support the validity of the methods only over a limited range of the ability continuum. Applications of the method are then discussed. Contrary to the title, the application of the linking results is not limited to aggregate-level data. Because the potential application is so broad, further research is needed on issues such as the possibility of conditional bias and the validity of estimates over the full range of possible values. Validity is not a dichotomous concept in which validity either exists or does not. The evidence reported by Reardon et al. provides substantial, but incomplete, support for the validity of the linked measures in this case study.


2017
Author(s): David Skylan Chester

The Taylor Aggression Paradigm (TAP) is a frequently used laboratory measure of aggressive behavior. However, the flexibility inherent in its implementation and analysis can undermine its validity. To test whether the TAP is a valid aggression measure irrespective of this flexibility, I conducted a preregistered study of a 25-trial version of the TAP, using a single scoring approach, with 160 diverse undergraduate participants. TAP scores showed agreement with other laboratory aggression measures and were magnified by an experimental provocation manipulation. Mixed evidence was found for associations with aggressive dispositions and real-world violence. These results provide preliminary support for this approach to the TAP as a measure of state-level aggressive behavior. However, more evidence is needed to assess the TAP's external validity and its ability to measure dispositional forms of aggression. Using preregistered designs, researchers should validate specific variants of their behavioral tasks in order to optimize the veridicality and reproducibility of psychological science.


2021
Vol 8 (1), pp. 1869367
Author(s): Sylvanus Sebbeh-Newton, Shaib Abdulazeez Shehu, Prosper Ayawah, Azupuri A. Kaba, Hareyani Zabidi

Author(s): Miguel A. Sánchez-Acevedo, Zaydi Anaí Acosta-Chi, Ma. del Rocío Morales-Salgado

Cardiovascular diseases are the leading cause of mortality worldwide. As more people suffer from diabetes and hypertension, the risk of cardiovascular disease (CVD) increases. A sedentary lifestyle, an unhealthy diet, and stressful activities are behaviors that can be changed to prevent CVD. Taking measures to prevent CVD lowers the cost of treatment and reduces mortality. Data-driven plans generate more effective results and can be applied to groups with similar characteristics. Several databases are currently available from which information can be extracted in real time to improve decision making. This article proposes a methodology for the detection of CVD and a web tool for analyzing the data more effectively. The methodology for extracting, describing, and visualizing data from a state-level case study of CVD in Mexico is presented. The data are obtained from the databases of the National Institute of Statistics and Geography (INEGI) and the National Survey of Health and Nutrition (ENSANUT). A k-nearest neighbor (KNN) algorithm is proposed to impute missing data.
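As an illustration of the kind of KNN imputation the abstract describes, here is a minimal sketch: each missing value is filled with the mean of that column over the k nearest rows, with distance computed on the fully observed columns. The records, features, and choice of k below are hypothetical placeholders, not the actual INEGI/ENSANUT variables or the authors' implementation.

```python
import numpy as np

# Hypothetical health-survey records: [age, BMI, systolic BP]; NaN = missing.
X = np.array([
    [34, 22.1, 118.0],
    [45, 27.5, 132.0],
    [51, 30.2, np.nan],
    [29, 21.0, 115.0],
    [47, 28.9, 135.0],
    [62, 31.4, 148.0],
])

def knn_impute(X, k=3):
    """Fill each NaN with the mean of that column over the k nearest rows,
    measuring Euclidean distance on the fully observed columns."""
    X = X.astype(float).copy()
    complete_cols = ~np.isnan(X).any(axis=0)         # fully observed features
    for i, j in zip(*np.where(np.isnan(X))):
        donors = ~np.isnan(X[:, j])                  # rows that observe col j
        d = np.linalg.norm(
            X[np.ix_(donors, complete_cols)] - X[i, complete_cols], axis=1)
        nearest = X[donors, j][np.argsort(d)[:k]]    # k closest donor values
        X[i, j] = nearest.mean()
    return X

filled = knn_impute(X)
print(filled[2, 2])   # imputed systolic BP for the incomplete record
```

Averaging over the k nearest complete records, rather than over the whole column, lets the imputed value reflect locally similar respondents (here, those with comparable age and BMI), which is the usual rationale for choosing KNN over simple mean imputation.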

