Edge: A Black and White teacher evaluation gap

Study of teacher evaluation system suggests outside-the-classroom factors explain most of why Black teachers get lower ratings

Researcher: Lauren Sartain

The following article is from the Fall 2021 issue of Edge: Carolina Education Review.

Are Black teachers being penalized by performance evaluations?

The Edge: Teacher evaluations are playing increasingly central roles in promotion, retention, tenure and dismissal decisions. But are the evaluations fair? A study of Recognizing Educators Advancing Chicago’s Students (REACH), the teacher evaluation system used in Chicago Public Schools, finds that lower performance ratings given to Black teachers can be almost entirely explained by the fact that those teachers are more likely to work in higher-poverty schools. The study, co-led by Lauren Sartain, has important implications given the widening demographic and racial gaps between students and their teachers and the shortage of teachers of color in American school classrooms.

Research co-led by Lauren Sartain, assistant professor of educational leadership at the UNC-Chapel Hill School of Education, found that lower performance ratings given to Black teachers in a teacher evaluation system used in Chicago Public Schools can be almost entirely explained by the fact that those teachers are more likely to work in higher-poverty schools.

The study, first published in December 2020 in Educational Evaluation and Policy Analysis, found that most of the difference in performance ratings between Black and White teachers could be explained by the characteristics of the schools in which teachers work. The paper was among a set of policy briefs that won first place for Outstanding Publication in the Assessment and Accountability category from the American Educational Research Association’s Division H (Research, Evaluation, and Assessment in Schools).

The study’s results have important implications given the widening demographic and racial gaps between students and their teachers, the shortage of teachers of color in American school classrooms, and evidence that minority students realize benefits from being exposed to minority teachers.

Under Chicago’s evaluation system, the typical Black teacher ranked at the 37th percentile in classroom observation scores, while the typical White teacher ranked at the 55th percentile.

But, by controlling for a variety of school, student, and classroom factors — such as socio-economic status, prior-year test scores, and prior-year behavior misconduct — Sartain and co-author Matthew Steinberg found that the Black-White gap in performance ratings disappeared.

Examining policies that affect schooling

Sartain has studied a range of topics around policies and practices that affect teaching in schools, with a focus on work related to equitable access to quality public education. She joined Carolina in 2019, coming to Chapel Hill from the University of Chicago Consortium on School Research, where she had worked as a researcher since 2008. She has also worked as a researcher at the University of Chicago Chapin Hall Center for Children.

Employing quantitative methods, Sartain has published and presented on a wide range of topics, including teacher quality, school choice and school quality, and discipline reform. Recent work also includes examinations of affirmative action policies aimed at helping diversify student populations within selective high schools and the effects of school closures on the populations of teachers within school districts.

Under a legislative mandate, Chicago Public Schools joined the nationwide movement to bolster teacher performance systems by adopting a program called Recognizing Educators Advancing Chicago’s Students — or, REACH. REACH launched in the 2012-2013 school year.

REACH replaced a 45-year-old evaluation system that relied on a once-a-year observation and a checklist-based approach to rating teacher practice. That system neither differentiated teachers by their effectiveness nor provided useful feedback that could help teachers improve their practices.

Under REACH, evaluators (principals or assistant principals) use a detailed rubric to observe and rate teacher practice during multiple classroom observations. The frequency of observations depends on a teacher’s prior performance ratings and on whether the teacher has earned tenure, which is granted at the start of the fourth year.

Observation scores account for 70% of a teacher’s summative evaluation score. Summative ratings are Unsatisfactory, Developing, Proficient, or Excellent.
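To make the 70% weighting concrete, here is a minimal sketch of a weighted summative score. Only the 70% weight comes from the article. The score scale, the idea of lumping the remaining 30% into a single "other measures" number, and the two example teachers are hypothetical, and the sketch is not the district's actual formula.

```python
# Weight on classroom observations, as stated in the article.
OBSERVATION_WEIGHT = 0.70


def summative_score(observation, other_measures,
                    observation_weight=OBSERVATION_WEIGHT):
    """Weighted average of the observation score and all other components.

    Assumption: both inputs are on the same (hypothetical) numeric scale,
    and every non-observation component is combined into `other_measures`.
    """
    return (observation_weight * observation
            + (1.0 - observation_weight) * other_measures)


# Two hypothetical teachers who are identical on the other measures
# but differ by half a point in their observation scores.
teacher_a = summative_score(observation=3.2, other_measures=3.0)
teacher_b = summative_score(observation=2.7, other_measures=3.0)

# Seventy percent of the observation difference carries into the total.
difference = teacher_a - teacher_b
print(f"summative difference: {difference:.2f}")
```

Under these assumptions, most of any gap in observation scores passes straight through to the summative score, which is why the fairness of the observation component matters so much.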

The ratings can have high stakes. Dismissal, remediation, and tenure attainment are tied to the ratings. Nontenured teachers with ratings in the bottom two categories may not have their contracts renewed. Tenured teachers with a Developing rating are placed on Professional Development Plans, which are in effect for one year. Tenured teachers with an Unsatisfactory rating are placed on a 90-day Remediation Plan and are subject to dismissal if their ratings do not improve. REACH ratings also affect the order in which teachers are laid off.

Sartain previously led a study, published in March 2020 by the University of Chicago Consortium on School Research, that surveyed teachers and administrators regarding their perceptions of the REACH evaluation system. It found that both teachers and administrators agreed that the evaluation system helped identify specific ways to improve practice. Most teachers — more than 80% — felt the observation scores were mostly or highly accurate.

But many teachers disagreed that REACH evaluations should be used to determine dismissal or tenure attainment. Only 15% of administrators disagreed.

Digging into the evaluations

Given that the REACH evaluations can affect the careers of teachers, are they treating all of Chicago’s teachers fairly?

To examine the effects of the REACH evaluation system, Sartain and Steinberg, an associate professor of education policy at George Mason University, analyzed data from the 2013-2014 and 2014-2015 school years, the first years of the system. Data analyzed in the study described 5,536 K–5 teachers from 411 Chicago elementary schools.

For each teacher, Sartain and Steinberg observed demographic information (race, gender, age), years of experience, degree attainment, and tenure status. They matched teachers to their evaluators, observing each evaluator’s demographic information, experience and formal school role. They also used student-level data to match classes of students to teachers.

The study also examined data that described students. Those data included each student’s prior-year achievement on standardized end-of-year exams; socioeconomic status, based on whether they qualified for free or reduced-price lunch; and behavior, based on the number of prior-year misconduct reports.

The study also included data that described school-level climate and instructional supports.

What the data show

The study found that 89% of the Black-White gap in classroom observation scores was explained by differences between the characteristics of schools where Black and White educators worked. The other 11% of the explained gap was related to classroom-level differences within individual schools, including student poverty, misconduct, and academic achievement.
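The logic of "explaining" a gap by school characteristics can be illustrated with a small simulation. The sketch below is not the authors' analysis and uses no REACH data. Every number in it is invented, a single made-up "poverty" value stands in for school context, and scores are constructed so that they depend on the school and not on the teacher's race. It compares the raw gap with the gap that remains when teachers are compared only with colleagues in the same school (a school fixed-effects comparison).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical district: 200 schools with 20 teachers each.
n_schools, teachers_per_school = 200, 20
poverty = rng.uniform(0.0, 1.0, n_schools)  # made-up school poverty rate
school = np.repeat(np.arange(n_schools), teachers_per_school)

# Assumption: Black teachers are more likely to work in higher-poverty schools.
black = rng.random(school.size) < (0.1 + 0.6 * poverty[school])

# Assumption: observation scores fall with school poverty, not with race.
score = 3.0 - 0.8 * poverty[school] + rng.normal(0.0, 0.3, school.size)

# Raw gap: mean White score minus mean Black score across the district.
raw_gap = score[~black].mean() - score[black].mean()


def demean_by_school(values):
    """Subtract each school's mean, leaving within-school variation only."""
    sums = np.bincount(school, weights=values, minlength=n_schools)
    counts = np.bincount(school, minlength=n_schools)
    return values - (sums / counts)[school]


# Within-school gap: least-squares slope of demeaned score on the demeaned
# Black indicator, sign-flipped so a positive value means White minus Black.
score_w = demean_by_school(score)
black_w = demean_by_school(black.astype(float))
adjusted_gap = -(score_w @ black_w) / (black_w @ black_w)

# Share of the raw gap that disappears once schools are held fixed.
share_explained = 1.0 - adjusted_gap / raw_gap

print(f"raw gap:       {raw_gap:.3f}")
print(f"adjusted gap:  {adjusted_gap:.3f}")
print(f"share explained by school differences: {share_explained:.0%}")
```

In this toy setup the raw gap is sizable while the within-school gap is close to zero, so nearly all of the gap is attributed to where teachers work. The published study uses a richer set of school, classroom, and student controls than this single factor.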

None of the race gap was explained by differences in teachers’ measured effectiveness in improving student achievement, by school culture, or by the race of the teachers’ evaluators.

The study also found that White teachers working in high-poverty schools were just as likely to receive lower evaluations in their classroom observations.

The study’s findings should inform how teacher evaluation systems are implemented, Sartain and Steinberg say.

If high-stakes personnel decisions rely on observation systems that do not take into account context-specific factors, districts run the risk of making decisions that have the consequence of reducing racial diversity among their teacher labor force.

Sartain and Steinberg say those reductions would likely affect the educational experiences of students, given the benefits that Black and other minority students receive from having same-race teachers.


Sartain, L., Zou, A., Gutiérrez, V., Shyjka, A., Hinton, E., Brown, E. R., & Easton, J. G. (2020). Teacher evaluation in CPS: Perceptions of REACH implementation, five years in. UChicago Consortium Research Brief.

Sartain, L. & Zou, A. (2020). Teacher evaluation in CPS: REACH ratings and teacher mobility. Chicago, IL: University of Chicago Consortium on School Research.

Shyjka, A. & Gutiérrez, V. (2020). Teacher evaluation in CPS: What makes evaluator feedback useful? Chicago, IL: University of Chicago Consortium on School Research.

Steinberg, M. P., & Sartain, L. (2021). What explains the race gap in teacher performance ratings? Evidence from Chicago Public Schools. Educational Evaluation and Policy Analysis 43(1), 60–82.