Declining Morale, Diminishing Autonomy, and Decreasing Value: Principal Reflections on a High-Stakes Teacher Evaluation System

Since the adoption of teacher evaluation systems that rely, at least in part, on controversial student achievement measures, little research has been conducted that focuses on stakeholders’ perceptions of systems in practice, specifically the perceptions of school principals. This study was conducted in a large urban school district to better understand principals’ perceptions of evaluating teachers based on professional and instructional practices as well as student achievement (i.e., value-added scores). Principals in this study strongly expressed concerns regarding: (a) the negative impact of the teacher evaluation system on district culture and morale; (b) their lack of autonomy in evaluating teachers and making staffing decisions; and (c) their perceived lack of value as professionals in the district. Examining the implications of teacher evaluation systems, per the experiences of principals as practitioners, is increasingly important if state and local policymakers as well as the general public are to better understand the intended and unintended consequences of these systems in practice.

State and Local Efforts to Investigate the Validity and Reliability of Scores from Teacher Evaluation Systems

Teachers College Record ◽

10.1177/016146811411600108 ◽

2014 ◽

Vol 116 (1) ◽

pp. 1-28 ◽

Cited By ~ 3

Author(s):

Corinne Herlihy ◽

Ezra Karger ◽

Cynthia Pollard ◽

Heather C. Hill ◽

Matthew A. Kraft ◽

...

Keyword(s):

Teacher Evaluation ◽

Reliability And Validity ◽

Teaching Quality ◽

Validity And Reliability ◽

Evaluation Systems ◽

Achievement Data ◽

State Education ◽

Performance Domains ◽

State And Local ◽

Teacher Evaluation Systems

Context: In the past two years, states have implemented sweeping reforms to their teacher evaluation systems in response to Race to the Top legislation and, more recently, NCLB waivers. With these new systems, policymakers hope to make teacher evaluation both more rigorous and more grounded in specific job performance domains such as teaching quality and contributions to student outcomes. Attaching high stakes to teacher scores has prompted an increased focus on the reliability and validity of these scores. Teachers unions have expressed strong concerns about the reliability and validity of using student achievement data to evaluate teachers and the potential for subjective ratings by classroom observers to be biased. The legislation enacted by many states also requires scores derived from teacher observations and the overall systems of teacher evaluation to be valid and reliable.

Focus of the Study: In this paper, we explore how state education officials and their district and local partners plan to implement and evaluate their teacher evaluation systems, focusing in particular on states’ efforts to investigate the reliability and validity of scores emerging from the observational component of these systems.

Research Design: Through document analysis and interviews with state education officials, we explore several issues that arise in observational systems, including the overall generalizability of teacher scores; the training, certification, and reliability of observers; and specifications regarding the sampling and number of lessons observed per teacher.

Findings: Respondents’ reports suggest that states are attending to the reliability and validity of scores, but inconsistently; in only a few states does there appear to be a coherent strategy regarding reliability and validity in place.

Conclusions: There remain a variety of system design and implementation decisions that states can optimize to increase the reliability and validity of their teacher evaluation scores. While a state may engage in auditing scores, for instance, it may miss the gains to reliability and validity that would accrue from periodic rater retraining and recertification, a stiff program of rater monitoring, and the use of multiple raters per teacher. Most troublesome are decisions about which and how many lessons to sample, which are either mandated legislatively, result from practical concerns or negotiations between stakeholders, or, at best, rest on broad research not directly related to the state context. This suggests that states should more actively investigate the number of lessons and lesson sampling designs required to yield high-quality scores.
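
The question of how many lessons to observe can be illustrated with the classical Spearman-Brown formula, which projects the reliability of a score averaged over several observations. The sketch below is not the authors' method; it assumes lessons behave like parallel measurements, and the single-lesson reliability of 0.3 and the target of 0.8 are hypothetical values chosen only for illustration.

```python
import math


def spearman_brown(single_reliability: float, k: int) -> float:
    """Projected reliability of the mean of k parallel observations."""
    r = single_reliability
    return k * r / (1 + (k - 1) * r)


def lessons_needed(single_reliability: float, target: float) -> int:
    """Smallest number of observed lessons whose mean reaches the target."""
    r = single_reliability
    k = target * (1 - r) / (r * (1 - target))
    return math.ceil(k)


# Hypothetical inputs: one lesson yields reliability 0.3.
four_lesson_reliability = spearman_brown(0.3, 4)
required = lessons_needed(0.3, 0.8)
print(f"4 lessons: {four_lesson_reliability:.3f}")
print(f"lessons needed for 0.8: {required}")
```

Under these assumed inputs, four lessons fall well short of the target, which is the kind of check the abstract suggests states could run on their own observation data.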

Keeping Great Teachers: A Case Study on the Impact and Implementation of a Pilot Teacher Evaluation System

Educational Policy ◽

10.1177/0895904816637685 ◽

2016 ◽

Vol 32 (3) ◽

pp. 363-394 ◽

Cited By ~ 5

Author(s):

Claire Robertson-Kraft ◽

Rosaline S. Zhang

Keyword(s):

Teacher Evaluation ◽

Teacher Retention ◽

Evaluation System ◽

School Level ◽

Evaluation Systems ◽

Teacher Survey ◽

Teacher Evaluation Systems ◽

The Individual ◽

The Impact

A growing body of research examines the impact of recent teacher evaluation systems; however, we have limited knowledge on how these systems influence teacher retention. This study uses a mixed-methods design to examine teacher retention patterns during the pilot year of an evaluation system in an urban school district in Texas. We used difference-in-differences analysis to examine the impact of the new system on school-level teacher turnover and administered a teacher survey (N = 1,301) to investigate individual and school-level factors influencing retention. This quantitative analysis was supplemented with interview data from two case study schools. Results suggest that, overall, the new evaluation system did not have a significant effect on teacher retention, but there was significant variation at the individual and school level. This study has important implications for policymakers developing new evaluation systems and researchers interested in evaluating their impact on retention.
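
For readers unfamiliar with difference-in-differences, the basic two-group, two-period estimate can be sketched as follows. This is a minimal illustration, not the study's model: the function name and all turnover rates are hypothetical, and a real analysis would typically use regression with controls and standard errors.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the comparison group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical school-level turnover rates (share of teachers leaving).
pilot_pre = [0.20, 0.25, 0.15]
pilot_post = [0.22, 0.27, 0.17]
comparison_pre = [0.18, 0.22, 0.20]
comparison_post = [0.19, 0.23, 0.21]

effect = did_estimate(pilot_pre, pilot_post, comparison_pre, comparison_post)
print(f"estimated effect on turnover: {effect:.3f}")
```

The estimate attributes to the new system only the part of the pilot schools' change that exceeds the change seen in comparison schools, which relies on the assumption that both groups would otherwise have followed parallel trends.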

Better integrating summative and formative goals in the design of next generation teacher evaluation systems

Education Policy Analysis Archives ◽

10.14507/epaa.28.5024 ◽

2020 ◽

Vol 28 ◽

pp. 63

Author(s):

Timothy G. Ford ◽

Kim Hewitt

Keyword(s):

Teacher Evaluation ◽

Professional Growth ◽

Evaluation System ◽

Self Determination Theory ◽

Next Generation ◽

Policy And Practice ◽

Evaluation Policy ◽

Evaluation Systems ◽

Teacher Evaluation Policy ◽

Teacher Evaluation Systems

In current teacher evaluation systems, the two main purposes of evaluation—accountability/goal accomplishment (summative) and professional growth/improvement (formative)—are often at odds with one another. However, they are not only compatible, but linking them within a unified teacher evaluation system may, in fact, be desirable. The challenge of the next generation of teacher evaluation systems will be to better integrate these two purposes in policy and practice. In this paper, we integrate the frameworks of Self-determination theory and Stronge’s Improvement-Oriented Model for Performance Evaluation. We use this integrated framework to critically examine teacher evaluation policy in Hawaii and Washington, D.C.—two distinctly different approaches to teacher evaluation—for the purposes of identifying a set of clear recommendations for improving the design and implementation of teacher evaluation policy moving forward.

An Analysis of Principal Perceptions of the Primary Teaching Evaluation System Used in Eight U.S. States

International Journal of Education Policy and Leadership ◽

10.22230/ijepl.2017v12n5a773 ◽

2017 ◽

Vol 12 (5) ◽

Cited By ~ 3

Author(s):

Richard L. Dodson

Keyword(s):

Teacher Evaluation ◽

Evaluation System ◽

Online Survey ◽

Teacher Evaluations ◽

Instructional Program ◽

Evaluation Instrument ◽

Public School Principals ◽

Evaluation Systems ◽

New Instrument ◽

Teacher Evaluation Systems

This research examines how public school principals in eight U.S. states perceive their teacher evaluation systems, which are based on Charlotte Danielson’s Framework for Teaching (FfT). States were selected to represent high, middle, and low scorers in the annual Education Week “Quality Counts” report (Education Week, 2016). Of the more than 8,100 working principals in the eight states, 1,142 responded to an online survey, yielding a response rate of over 14%. Most principals were not satisfied with FfT and found implementing the system too cumbersome. Responses suggested an average of two changes to FfT desired by each principal; few wanted to keep their FfT as is. Targets for improvement included overhauling software used to enter teacher evaluations; eliminating student growth goals and student test scores (VAMs) as part of evaluations; reducing the time and paperwork required; and providing more training for administrators and teachers on the use of FfT. Some states’ principals wanted to return control over teacher evaluation systems to local school districts. Most respondents agreed that their version of FfT has improved their school’s instructional program, and they prefer the new instrument over their previous evaluation instrument.

Can Principals Promote Teacher Development as Evaluators? A Case Study of Principals’ Views and Experiences

Educational Administration Quarterly ◽

10.1177/0013161x16653445 ◽

2016 ◽

Vol 52 (5) ◽

pp. 711-753 ◽

Cited By ~ 57

Author(s):

Matthew A. Kraft ◽

Allison F. Gilmour

Keyword(s):

Teacher Evaluation ◽

Teacher Development ◽

Evaluation System ◽

Evaluation Process ◽

Unintended Consequences ◽

Urban District ◽

Evaluation Systems ◽

Primary Responsibility

Purpose: New teacher evaluation systems have expanded the role of principals as instructional leaders, but little is known about principals’ ability to promote teacher development through the evaluation process. We conducted a case study of principals’ perspectives on evaluation and their experiences implementing observation and feedback cycles to better understand whether principals feel as though they are able to promote teacher development as evaluators. Research Method: We conducted interviews with a stratified random sample of 24 principals in an urban district that recently implemented major reforms to its teacher evaluation system. We analyzed these interviews by drafting thematic summaries, coding interview transcripts, creating data-analytic matrices, and writing analytic memos. Findings: We found that the evaluation reforms provided a common framework and language that helped facilitate principals’ feedback conversations with teachers. However, we also found that tasking principals with primary responsibility for conducting evaluations resulted in a variety of unintended consequences which undercut the quality of evaluation feedback they provided. We analyze five broad solutions to these challenges: strategically targeting evaluations, reducing operational responsibilities, providing principal training, hiring instructional coaches, and developing peer evaluation systems. Implications: The quality of feedback teachers receive through the evaluation process depends critically on the time and training evaluators have to provide individualized and actionable feedback. Districts that task principals with primary responsibility for conducting observation and feedback cycles must attend to the many implementation challenges associated with this approach in order for next-generation evaluation systems to successfully promote teacher development.

Enacting the Rubric: Teacher Improvements in Windows of High-Stakes Observation

Education Finance and Policy ◽

10.1162/edfp_a_00295 ◽

2019 ◽

pp. 1-51

Author(s):

Aaron R. Phipps ◽

Emily A. Wiseman

Keyword(s):

Teacher Evaluation ◽

Evaluation System ◽

Teacher Practices ◽

Causal Link ◽

High Stakes ◽

Evaluation Program ◽

Evaluation Systems ◽

Post Evaluation ◽

Teacher Evaluation Systems ◽

Teacher Evaluation Program

Teacher evaluation systems that use in-class observations, particularly in high-stakes settings, are frequently understood as accountability systems intended as non-intrusive measures of teacher quality. Presumably, the evaluation system motivates teachers to improve their practice – an accountability mechanism – and provides actionable feedback for improvement – an information mechanism. No evidence exists, however, establishing the causal link between an evaluation program and daily teacher practices. Importantly, it is unknown how teachers may modify their practice in the time leading up to an unannounced in-class observation, or how they integrate feedback into their practice post-evaluation, a question that fundamentally changes the design and philosophy of teacher evaluation programs. We disentangle these two effects with a unique empirical strategy that exploits random variation in the timing of in-class observations in the Washington, D.C. teacher evaluation program IMPACT. Our key finding is that teachers work to improve during periods in which they are more likely to be observed, and they improve with subsequent evaluations. We interpret this as evidence that both mechanisms are at work, and as a result, policymakers should seriously consider both when designing teacher evaluation systems.

Exploring the Intended and Unintended Consequences of High-Stakes Teacher Evaluation on Schools, Teachers, and Students

Teachers College Record ◽

10.1177/016146811411600103 ◽

2014 ◽

Vol 116 (1) ◽

pp. 1-29

Author(s):

Alyson Leah Lavigne

Keyword(s):

Student Achievement ◽

Teacher Evaluation ◽

Value Added ◽

Opportunity To Learn ◽

Unintended Consequences ◽

Observation Data ◽

Effective Teacher ◽

High Stakes ◽

Improve Student Achievement ◽

Teacher Workforce

Background/Context: The stakes are getting higher for teachers daily as more and more states adopt hiring, firing, and tenure-granting policies based on teacher evaluations. Even more concerning is the limited discussion about whether or not high-stakes teacher evaluation can meet the intended outcome of improved student achievement, and at what cost. These high-stakes decisions are based on the rationale that firing ineffective teachers (as primarily measured by observation data and value-added scores) will improve student achievement. This premise is challenged by various variables and assumptions (e.g., reliability, validity, percentage fired, and turnover) that, if not met, could result in a number of possible unintended consequences.

Focus of Study: This paper examines the history of high-stakes teacher evaluation and the ways in which teacher evaluation data are being used in today's schools to make human capital decisions. The intended consequences and unintended consequences are explored in great detail.

Research Design: This paper is an analytic essay.

Conclusions/Recommendations: There is no evidence that high-stakes teacher evaluation can produce a more effective teacher workforce and improve student achievement. Even if basic requirements and assumptions are met (e.g., highly reliable and valid measures, retention of effective teachers, and highly effective hires), gains in student achievement may be short lived, insignificant, or practically meaningless. The possible unintended consequences could result in worse, rather than better, student achievement outcomes and increase the gap in opportunity to learn for students attending the most and least affluent schools.

Putting teacher evaluation systems on the map: An overview of states’ teacher evaluation systems post–Every Student Succeeds Act

Education Policy Analysis Archives ◽

10.14507/epaa.28.5252 ◽

2020 ◽

Vol 28 ◽

pp. 58 ◽

Cited By ~ 3

Author(s):

Kevin Close ◽

Audrey Amrein-Beardsley ◽

Clarin Collins

Keyword(s):

Teacher Evaluation ◽

Value Added ◽

Department Of Education ◽

Growth State ◽

Evaluation Systems ◽

Present Information ◽

Value Added Models ◽

Before And After ◽

Teacher Evaluation Systems ◽

Every Student Succeeds Act

The Every Student Succeeds Act (ESSA) loosened the federal policy grip over states’ teacher accountability systems. We present information, collected via surveys sent to state department of education personnel, about all states’ teacher evaluation systems post–ESSA, while also highlighting differences before and after ESSA. We found that states have decreased their use of growth or value-added models (VAMs) within their teacher evaluation systems. In addition, many states are offering more alternatives for measuring the relationships between student achievement and teacher effectiveness besides using test score growth. State teacher evaluation plans also contain more language supporting formative teacher feedback. States are also allowing districts to develop and implement more unique teacher evaluation systems, while acknowledging challenges with states’ being able to support varied systems, as well as incomparable data across schools and districts in effect.

A Phenomenological Exploration of How Public School Administrators Balance Teacher Evaluation and Support of Teacher Growth

10.28971/242017sm135 ◽

2017 ◽

Author(s):

Mary Slattery

Keyword(s):

Student Achievement ◽

Teacher Evaluation ◽

Rhode Island ◽

Evaluation System ◽

Teacher Growth ◽

Evaluation Tools ◽

School Factors ◽

High Stakes ◽

Evaluation Systems ◽

The Evaluation System

Educational reform efforts over the past several decades have focused on different ways to address student achievement. Since the quality of the teacher that students have is one of the biggest in-school factors that impact student achievement, schools have focused on identifying, measuring, and improving teacher quality. Some of those reform efforts have focused their attention on teacher evaluation as a way of increasing student achievement. There has been much research on the components of quality evaluation tools. There is also research on ways to impact and change teacher practice. Unfortunately, most evaluation tools do not impact teaching practice. Rhode Island administrators are currently facing the challenge of implementing a high-stakes evaluation system while also finding ways to help support growth and development of their teachers. This phenomenological study used open-ended interviews to understand how six administrators in Rhode Island negotiate that complexity. Administrators interviewed emphasized the importance of developing a trusting, positive climate and utilizing this climate, along with various components of the evaluation system, to provide teachers with the kinds of support that will impact teacher growth and practice in the classroom. Administrators also expressed their frustration at some of the elements of the evaluation system and limitations they have found in their ability to impact teacher change. Findings from this study have implications for those revising or creating educator evaluation systems, as well as for administrators who must use high-stakes evaluation systems while simultaneously attempting to impact teacher growth, development, and change in practice.

The Consistency of Composite Ratings of Teacher Effectiveness: Evidence From New Mexico

American Educational Research Journal ◽

10.3102/0002831219841369 ◽

2019 ◽

Vol 56 (6) ◽

pp. 2116-2146 ◽

Cited By ~ 1

Author(s):

Sy Doan ◽

Jonathan D. Schweig ◽

Kata Mihaly

Keyword(s):

Teacher Evaluation ◽

Teacher Effectiveness ◽

Teacher Quality ◽

Evaluation System ◽

High Stakes ◽

Evaluation Systems ◽

Composite Rating ◽

Teacher Evaluation Systems ◽

Measures Of Performance ◽

Original Rating

Contemporary teacher evaluation systems use multiple measures of performance to construct ratings of teacher quality. While the properties of constituent measures have been studied, little is known about whether composite ratings themselves are sufficiently reliable to support high-stakes decision making. We address this gap by estimating the consistency of composite ratings of teacher quality from New Mexico’s teacher evaluation system from 2015 to 2016. We estimate that roughly 40% of teachers would receive a different composite rating if reevaluated in the same year; 97% of teachers would receive ratings within ±1 level of their original rating. We discuss mechanisms by which policymakers can improve rating consistency, and the implications of those changes to other properties of teacher evaluation systems.
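
The two consistency figures quoted above correspond to exact agreement and agreement within one level between an original rating and a re-evaluation. The sketch below shows how such rates are computed; it is not the authors' estimation procedure, and the 1 to 5 rating scale and both rating lists are hypothetical values used only for illustration.

```python
def agreement_rates(first, second):
    """Return (exact agreement, agreement within one level) for paired ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(first)
    exact = sum(a == b for a, b in zip(first, second)) / n
    adjacent = sum(abs(a - b) <= 1 for a, b in zip(first, second)) / n
    return exact, adjacent


# Hypothetical composite ratings for ten teachers on an assumed 1-5 scale.
original = [1, 2, 3, 4, 5, 3, 2, 4, 3, 5]
reevaluated = [1, 3, 3, 4, 4, 3, 2, 2, 4, 5]

exact, adjacent = agreement_rates(original, reevaluated)
print(f"different rating on re-evaluation: {1 - exact:.0%}")
print(f"within one level of original: {adjacent:.0%}")
```

Reporting both rates matters because a system can look unstable on exact agreement while still rarely moving a teacher more than one level.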
