Contradictions Resolved: An Analysis of Two Theories of the Achievement Gap

Background: Value-added modeling (VAM) has been used to rank teachers and assess teacher and school quality. The apparent relationship between value-added teacher rankings and gains in student performance provides a foundation for the view that the contribution of teachers to student performance is the largest factor influencing student achievement, suggesting that differences in teacher quality might explain the persistence of the gap in student achievement as students advance through the K–12 years. However, several studies raise questions about the reliability and validity of VAM. Purpose: The purpose of this article is to reconcile the evidence that the contribution of teachers to student achievement is large with the evidence that value-added rankings are unreliable and possibly invalid. Design: The method involves an analytical review of the available evidence, development of a theoretical explanation for the contradictory results, and a test of this explanation using path analysis with three longitudinal datasets involving nationally representative samples of schools and students. Conclusion: The hypothesis that the contribution of teachers to student performance is the strongest factor influencing student achievement is not supported. A stronger factor is the degree to which students believe that they are proficient students. This is consistent with the view that the persistence of the achievement gap is better explained as the outcome of structural factors embedded in the conventional model of schooling that undermines the self-efficacy, engagement, effort, and achievement of students who enter kindergarten performing below the level of their more advantaged peers.

Educational Accountability, Value-Added Modeling, and the Origin of the Achievement Gap

Education and Urban Society, 2020, Vol 52 (8), pp. 1181-1203. DOI: 10.1177/0013124519896823

Author(s): Stuart S. Yeh

Keyword(s): Achievement Gap, Black Students, Student Performance, School System, Teacher Performance, Value Added, Random Assignment, Grouping Practices, Value Added Modeling, K 12

Empirical results are consistent with the hypothesis that socioeconomic factors contribute to initial disparities in performance that are perpetuated by demoralizing grading, testing, and grouping practices throughout the K-12 years. The hypothesis may explain why the achievement gap increases after children enter the school system, why Black students lose ground within schools and within classrooms, why value-added modeling (VAM) estimates of teacher performance are unstable from year to year, why Rothstein found that VAM estimates of teacher performance predict prior student performance, why VAM estimates of teacher performance predict gains in student achievement, and why persistent sorting may account for the Gates Foundation’s Measures of Effective Teaching (MET) results despite random assignment of class rosters to teachers.

Teacher Turnover, Teacher Quality, and Student Achievement in DCPS

Educational Evaluation and Policy Analysis, 2016, Vol 39 (1), pp. 54-76. DOI: 10.3102/0162373716663646. Cited By ~ 39

Author(s): Melinda Adnot, Thomas Dee, Veronica Katz, James Wyckoff

Keyword(s): Student Achievement, Public Schools, Student Performance, Teacher Turnover, School Quality, Poor Performance, Negative Effects, Policy Regime, Positive Effects, Quasi Experimental

In practice, teacher turnover appears to have negative effects on school quality as measured by student performance. However, some simulations suggest that turnover can instead have large positive effects under a policy regime in which low-performing teachers can be accurately identified and replaced with more effective teachers. This study examines this question by evaluating the effects of teacher turnover on student achievement under IMPACT, the unique performance-assessment and incentive system in the District of Columbia Public Schools (DCPS). Employing a quasi-experimental design based on data from the first years of IMPACT, we find that, on average, DCPS replaced teachers who left with teachers who increased student achievement by 0.08 standard deviation (SD) in math. When we isolate the effects of lower-performing teachers who were induced to leave DCPS for poor performance, we find that student achievement improves by larger and statistically significant amounts (i.e., 0.14 SD in reading and 0.21 SD in math). In contrast, the effect of exits by teachers not sanctioned under IMPACT is typically negative but not statistically significant.

Measuring Investment in Education

The Journal of Economic Perspectives, 1996, Vol 10 (4), pp. 9-30. DOI: 10.1257/jep.10.4.9. Cited By ~ 104

Author(s): Eric A Hanushek

Keyword(s): Human Capital, Student Achievement, Public Schools, Education Policy, Student Performance, School Quality, Policy Instruments, Economic Research, Performance Incentives, Education Resources

Historic debates about the measurement of capital are even more complicated in the case of education and human capital. As extensive research demonstrates, education resources are not consistently related to student performance in existing elementary and secondary schools. This inefficiency in public schools implies that spending and resource measures do not accurately capture variations in school quality. This finding then has clear implications for both education policy and economic research. Because school inputs are poor policy instruments, an alternative policy focus that appears much more productive is performance incentives related to student achievement.

Hierarchical Social Network Analysis Using a Multi-Agent System

International Journal of Agent Technologies and Systems, 2013, Vol 5 (3), pp. 14-32. DOI: 10.4018/ijats.2013070102

Author(s): Lizhu Ma, Xin Zhang

Keyword(s): Social Networks, Student Performance, Hierarchical Clustering, Hierarchical Structure, School Quality, Multi Agent System, Agent System, Funding Policies, Multi Agent, K 12

The quality of K-12 education has been a major concern in the nation for years. School systems, just like many other social networks, appear to have a hierarchical structure. Understanding this structure could be the key to better evaluating student performance and improving school quality. Many studies have focused on detecting hierarchical structure by using hierarchical clustering algorithms. The authors design an interaction-based similarity measure to accomplish hierarchical clustering in order to detect hierarchical structures in social networks (e.g., school district networks). This method uses a multi-agent system, for it is based on agent interactions. With the network structure detected, they also build a model, based on the MAXQ algorithm, to decompose the funding policy task into subtasks and then evaluate these subtasks by using funding distribution policies from past years and looking for possible relationships between student performance and funding policies. For the experiment, the authors used real school data from Bexar County’s 15 school districts in Texas. The first result shows that their interaction-based method is able to generate meaningful clustering and dendrograms for social networks. Additionally, the authors’ policy evaluation model is able to evaluate funding policies from the past three years in Bexar County and conclude that increasing funding does not necessarily have a positive impact on student performance; it is generally not the case that the more is spent, the better.

A Reanalysis of the Effects of Teacher Replacement Using Value-Added Modeling

Teachers College Record, 2013, Vol 115 (12), pp. 1-35. DOI: 10.1177/016146811311501202

Author(s): Stuart S. Yeh

Keyword(s): Student Achievement, Large Scale, Reliability And Validity, Value Added, Cost Effective, Large Numbers, Cost Effective Approach, Need To Evaluate, Value Added Modeling, Lifetime Earnings

Background: In principle, value-added modeling (VAM) might be justified if it can be shown to be a more reliable indicator of teacher quality than existing indicators for existing low-stakes decisions that are already being made, such as the award of small merit bonuses. However, a growing number of researchers now advocate the use of VAM to identify and replace large numbers of low-performing teachers. There is a need to evaluate these proposals because the active termination of large numbers of teachers based on VAM requires a much higher standard of reliability and validity. Furthermore, these proposals must be evaluated to determine if they are cost-effective compared to alternative proposals for raising student achievement. While VAM might be justified as a replacement for existing indicators (for existing decisions regarding merit compensation), it might not meet the higher standard of reliability and validity required for large-scale teacher termination, and it may not be the most cost-effective approach for raising student achievement. If society devotes its resources to approaches that are not cost-effective, the increase in achievement per dollar of resources expended will remain low, inhibiting reduction of the achievement gap. Objective: This article reviews literature regarding the reliability and validity of VAM, then focuses on an evaluation of a proposal by Chetty, Friedman, and Rockoff to use VAM to identify and replace the lowest-performing 5% of teachers with average teachers. Chetty et al. estimate that implementation of this proposal would increase the achievement and lifetime earnings of students. The results appear likely to accelerate the adoption of VAM by school districts nationwide. The objective of the current article is to evaluate the Chetty et al. proposal and the strategy of raising student achievement by using VAM to identify and replace low-performing teachers. Method: This article analyzes the assumptions of the Chetty et al. study and the assumptions of similar VAM-based proposals to raise student achievement. This analysis establishes a basis for evaluating the Chetty et al. proposal and, in general, a basis for evaluating all VAM-based policies to raise achievement. Conclusion: VAM is not reliable or valid, and VAM-based policies are not cost-effective for the purpose of raising student achievement and increasing earnings by terminating large numbers of low-performing teachers.

Distorting Value Added: The Use of Longitudinal, Vertically Scaled Student Achievement Data for Growth-Based, Value-Added Accountability

Journal of Educational and Behavioral Statistics, 2006, Vol 31 (1), pp. 35-62. DOI: 10.3102/10769986031001035. Cited By ~ 58

Author(s): Joseph A. Martineau

Keyword(s): Student Achievement, Student Performance, Value Added, Student Growth, Future Research, Achievement Data, Value Added Models, Value Added Modeling, Construct Shift, Vertical Scales

Longitudinal, student performance-based, value-added accountability models have become popular of late and continue to enjoy increasing popularity. Such models require student data to be vertically scaled across wide grade and developmental ranges so that the value added to student growth/achievement by teachers, schools, and districts may be modeled in an accurate manner. Many assessment companies provide such vertical scales and claim that those scales are adequate for longitudinal value-added modeling. However, psychometricians tend to agree that scales spanning wide grade/developmental ranges also span wide content ranges, and that scores cannot be considered exchangeable along the various portions of the scale. This shift in the constructs being measured from grade to grade jeopardizes the validity of inferences made from longitudinal value-added models. This study demonstrates mathematically that the use of such “construct-shifting” vertical scales in longitudinal, value-added models introduces remarkable distortions in the value-added estimates of the majority of educators. These distortions include (a) identification of effective teachers/schools as ineffective (and vice versa) simply because their students’ achievement is outside the developmental range measured well by “appropriate” grade-level tests, and (b) the attribution of prior teacher/school effects to later teachers/schools. Therefore, theories, models, policies, rewards, and sanctions based upon such value-added estimates are likely to be invalid because of distorted conclusions about educator effectiveness in eliciting student growth. This study identifies highly restrictive scenarios in which current value-added models can be validly applied in high-stakes and low-stakes research uses. 
This article further identifies one use of student achievement data for growth-based, value-added modeling that is not plagued by the problems of construct shift: the assessment of an upper grade content (e.g., fourth grade) in both the grade below and the appropriate grade to obtain a measure of student gain on a grade-specific mix of constructs. Directions for future research on methods to alleviate the problems of construct shift are identified as well.

Proficiency in English is a better predictor of educational achievement than English as an Additional Language (EAL)

2020. DOI: 10.31219/osf.io/ph4fw

Author(s): Annina Hessel, Steve Strand

Keyword(s): Student Achievement, Student Performance, Representative Sample, Educational Achievement, Explanatory Power, Teacher Rating, Economic Disadvantage, Student Background, Language Assessments, Nationally Representative

We compared two tools that have been used to capture the linguistic heterogeneity and achievement of students in England: the exposure-based distinction between English as an additional language (EAL) and monolingual learners, and the 2017–2018 five level teacher rating of proficiency in English (from ‘New to English’ to ‘Fluent’). Based on a nationally representative sample of 140,000 students aged 5 to 16 years, we assessed the explanatory power of the proficiency in English rating in relation to educational achievement and compared it directly to EAL status. Our results demonstrate that proficiency in English is a significantly better predictor of student achievement than EAL status and that it accounts for up to six times more variance than other student background variables (ethnicity, gender and socio-economic disadvantage) combined. Proficiency in English was particularly (but certainly not solely) predictive for student performance in language-heavy subjects such as English and reading. Our findings are clear in demonstrating the value of a proficiency in English rating for assessing linguistic heterogeneity and student achievement, in contrast to the exposure-based EAL measure. We encourage the (re)introduction of proficiency in English ratings to monitor and support student progress and discuss the value of classroom-based language assessments.

Alternative Assessment for Music Students with Significant Disabilities

The Oxford Handbook of Philosophical and Qualitative Assessment in Music Education, 2019, pp. 224-241. DOI: 10.1093/oxfordhb/9780190265182.013.31

Author(s): Donald DeVito, Megan M. Sheridan, Jian-Jun Chen-Edmund, David Edmund, Steven Bingham

Keyword(s): Public School, Music Education, Students With Disabilities, Student Performance, Alternative Assessment, Performance Outcomes, General Music, Significant Disabilities, School Music, K 12

How is it possible to move beyond assessment for the purposes of evaluating teacher proficiency and student performance outcomes and instead to consider assessment for understanding student musical experiences and preferences for the purpose of promoting lifelong musical engagement? This chapter includes and examines three distinct music education approaches that have been taken at the K–12 Sidney Lanier Center School for students with varying exceptionalities in Gainesville, Florida. Megan Sheridan illustrates inclusion and assessment using the Kodály approach. David Edmund and Jian-Jun Chen-Edmund examine creative lessons developed for exceptional learners in a general music setting. Steven Bingham and Donald DeVito illustrate adaptive jazz inclusion and performance for public school and university students with disabilities. This collaborative development in qualitative music assessment has taken place through (1) developing methods of communicating recognition of student engagement and affective responses during inclusive engagement in public school music education settings, specifically in Kodály-based music instruction, K–12 general music classes, and secondary jazz ensembles; (2) using students’ interest and engagement as a means of curriculum development and assessment in inclusive public school music settings; and (3) building collaborative relationships with parents and the community for post-school lifelong music learning.

The Psychometric Properties of the Measure of Adolescent Relationship Harassment and Abuse (MARSHA) With a Nationally Representative Sample of U.S. Youth

Journal of Interpersonal Violence, 2021, pp. 088626052098548. DOI: 10.1177/0886260520985480

Author(s): Emily F. Rothman, Carlos A. Cuevas, Elizabeth A. Mumford, Eva Bahrami, Bruce G. Taylor

Keyword(s): Representative Sample, Reliability And Validity, Population Based, Good Reliability, Adolescent Dating, Dating Abuse, New Instrument, Convergent And Divergent Validity, Nationally Representative, Adolescent Relationship

This article describes a new instrument that assesses adolescent dating abuse (ADA) victimization and perpetration. The Measure of Adolescent Relationship Harassment and Abuse (MARSHA) is a comprehensive instrument that includes items on physical, sexual, and psychological ADA, as well as cyber dating abuse, social control, and invasion of privacy. Data for this study came from a population-based, nationally representative sample of adolescents ages 11 to 21 years old (N = 1,257). Exploratory factor analysis was conducted for the victimization and perpetration versions of the MARSHA, and convergent and divergent validity were assessed using the Conflict in Adolescent Dating Relationships Inventory (CADRI) and the Juvenile Victimization Questionnaire (JVQ), respectively. Results suggest that the MARSHA has good reliability and validity, and that each subscale had good internal consistency. The authors propose that the MARSHA may be a strong alternative to the CADRI or the Conflict Tactics Scale (CTS) because it reflects contemporary forms of abuse, such as online harassment, pressure to send nude selfies, and the nonconsensual dissemination of sexually explicit images.
