State and Local Efforts to Investigate the Validity and Reliability of Scores from Teacher Evaluation Systems

2014 ◽  
Vol 116 (1) ◽  
pp. 1-28 ◽  
Author(s):  
Corinne Herlihy ◽  
Ezra Karger ◽  
Cynthia Pollard ◽  
Heather C. Hill ◽  
Matthew A. Kraft ◽  
...  

Context In the past two years, states have implemented sweeping reforms to their teacher evaluation systems in response to Race to the Top legislation and, more recently, NCLB waivers. With these new systems, policymakers hope to make teacher evaluation both more rigorous and more grounded in specific job performance domains such as teaching quality and contributions to student outcomes. Attaching high stakes to teacher scores has prompted an increased focus on the reliability and validity of these scores. Teachers unions have expressed strong concerns about the reliability and validity of using student achievement data to evaluate teachers and the potential for subjective ratings by classroom observers to be biased. The legislation enacted by many states also requires scores derived from teacher observations and the overall systems of teacher evaluation to be valid and reliable.
Focus of the Study In this paper, we explore how state education officials and their district and local partners plan to implement and evaluate their teacher evaluation systems, focusing in particular on states’ efforts to investigate the reliability and validity of scores emerging from the observational component of these systems.
Research Design Through document analysis and interviews with state education officials, we explore several issues that arise in observational systems, including the overall generalizability of teacher scores; the training, certification, and reliability of observers; and specifications regarding the sampling and number of lessons observed per teacher.
Findings Respondents’ reports suggest that states are attending to the reliability and validity of scores, but inconsistently; in only a few states does there appear to be a coherent strategy regarding reliability and validity in place.
Conclusions There remain a variety of system design and implementation decisions that states can optimize to increase the reliability and validity of their teacher evaluation scores. While a state may engage in auditing scores, for instance, it may miss the gains to reliability and validity that would accrue from periodic rater retraining and recertification, a stiff program of rater monitoring, and the use of multiple raters per teacher. Most troublesome are decisions about which and how many lessons to sample, which are mandated legislatively, result from practical concerns or negotiations between stakeholders, or, in the best case, rest on broad research not directly related to the state context. This suggests that states should more actively investigate the number of lessons and lesson sampling designs required to yield high-quality scores.

Author(s):  
Cara Faith Bernard ◽  
Joseph Michael Abramo

This chapter provides background on laws and policies related to teacher evaluation in the United States. This background might help music teachers navigate teacher evaluation systems and avoid misunderstanding evaluators’ motivations and pressures. First, the chapter provides a brief history of federal and state education law and education policy. This history is presented as a series of four phases, each aimed at standardizing public education. These phases move from evaluating standards through student performance and standardized tests to teacher performance and quality through instructional practice. Second, there is an investigation of how history and policy have led to tensions, disagreements, and contradictions within teacher evaluation processes and policies. Finally, the chapter describes how these tensions have resulted in the common characteristics of teacher evaluation systems found throughout the country. By understanding this background and history, music teachers may begin to actively and constructively participate in teacher evaluation.


Author(s):  
Dru Davison ◽  
Ryan Fisher

This chapter provides an in-depth analysis of the development and implementation of an alternative student growth measures system specifically designed by arts educators to provide teachers with a fair, flexible, and rigorous method of demonstrating teacher effectiveness as part of a multiple-measures teacher evaluation system. We also present a brief overview of the Race to the Top legislation as well as the No Child Left Behind waivers in the United States as they relate to the increased attention to the use of student growth and achievement data in teacher evaluation systems. An overview of multiple-measures evaluation systems is also included, with particular attention to the student growth data portion of those measures. The initial guidance from the US Department of Education regarding various approaches to incorporating student growth data in teacher evaluation systems is discussed. Implications for music education are also presented.


Author(s):  
Noelle A Paufler

Since the adoption of teacher evaluation systems that rely, at least in part, on controversial student achievement measures, little research has been conducted that focuses on stakeholders’ perceptions of systems in practice, specifically the perceptions of school principals. This study was conducted in a large urban school district to better understand principals’ perceptions of evaluating teachers based on professional and instructional practices as well as student achievement (i.e., value-added scores). Principals in this study strongly expressed concerns regarding: (a) the negative impact of the teacher evaluation system on district culture and morale; (b) their lack of autonomy in evaluating teachers and making staffing decisions; and (c) their perceived lack of value as professionals in the district. Examining the implications of teacher evaluation systems, per the experiences of principals as practitioners, is increasingly important if state and local policymakers as well as the general public are to better understand the intended and unintended consequences of these systems in practice.


2013 ◽  
Vol 83 (2) ◽  
pp. 371-384 ◽  
Author(s):  
Heather Hill ◽  
Pam Grossman

In this article, Heather C. Hill and Pam Grossman discuss the current focus on using teacher observation instruments as part of new teacher evaluation systems being considered and implemented by states and districts. They argue that if these teacher observation instruments are to achieve the goal of supporting teachers in improving instructional practice, they must be subject-specific, involve content experts in the process of observation, and provide information that is both accurate and useful for teachers. They discuss the instruments themselves, raters and system design, and timing of and feedback from the observations. They conclude by outlining the challenges that policy makers face in designing observation systems that will work to improve instructional practice at scale.


Author(s):  
Susan K. S. Grigsby

Current teacher evaluation systems place value on collaborative relationships between educators. Common Core Standards strongly suggest interdisciplinary units of study and make several references to reading and literacy in all subject areas. While current trends in education point to collaboration and a more holistic approach to learning, the practice remains a hit-or-miss exercise. The school librarian is in the unique position of being able to see connections among the different subject areas by virtue of his or her expertise on the resources available to each of these areas. When he or she is able to assist in the creation of curricular units that incorporate a variety of subjects into lesson plans, students are free to make connections that may have eluded them in single-subject studies. This chapter discusses one librarian’s approach to collaboration that focuses on relationships, connections, and creativity.


Author(s):  
Ron Astor ◽  
Rami Benbenishty

Even those who agree with the idea of creating a monitoring system might still need to be convinced that what students have to say should be considered valuable input in the effort to improve schools, whether it pertains to raising academic performance or to safety, security, and behavior. Some argue that students are so disinterested in surveys that they answer randomly or give the first answer that comes to mind. Others are concerned that students respond deliberately in ways intended to harm staff members they do not like. Still others doubt that students really understand the questions and, therefore, that their answers are usable. Students, however, are often the best sources of detailed information on what is happening in schools and may even provide realistic suggestions on how adults can intervene. Looking at the ways students’ perceptions are already being used in schools can help policymakers and educators see how they can be part of improving school climate. This issue has been debated in recent years as some states and school districts have moved to include students’ opinions on their experiences in the classroom as one component of new teacher evaluation systems. For example, the Tripod survey, developed by Harvard University’s Ron Ferguson, asks students how much they agree with statements such as “My teacher explains difficult things clearly” and “Our class stays busy and doesn’t waste time.” The Tripod was used as part of the Bill and Melinda Gates Foundation’s Measures of Effective Teaching project and is being used in districts across the United States, in Canada, and in China.
In a 2013 report, Hanover Research reviewed the literature on using student perception surveys in teacher evaluation and professional development: “Given the consistent findings of the research reviewed for this report, it is reasonable to conclude that student perception surveys can provide accurate measures of teacher effectiveness,” they write. “When the proper instrument, or survey, is utilized, student feedback can be more accurate than alternative, more widely used instruments at predicting achievement gains.”

