Learning from what doesn’t work in teacher evaluation

KEVIN CLOSE (kclose1@asu.edu) is a doctoral student in the Learning, Literacies, and Technologies Program at Mary Lou Fulton Teachers College at Arizona State University in Tempe.
AUDREY AMREIN-BEARDSLEY (audrey.beardsley@asu.edu) is a professor in the Educational Policy and Evaluation Program at Mary Lou Fulton Teachers College, Arizona State University. She is the author of Rethinking Value-Added Models in Education: Critical Perspectives on Tests and Assessment-Based Accountability (Routledge, 2014) and coeditor of Student Growth Measures in Policy and Practice: Intended and Unintended Consequences of High-Stakes Teacher Evaluations (Palgrave, 2016).


  • Joel Berg, Ph.D.

    Last sentence: “evaluation” not “education”.

    The best teaching is the improvement of attitudes, leading to eagerness and enjoyment in learning. Abilities will develop in all subject areas under this emphasis.

    A fine article.

  • Laura H. Chapman

    Then there is the fact that the VAM rituals are not applicable to teachers for whom there are no standardized state tests. The last estimate for that population was about 69%. The distortion of the whole of education is too rarely noted in all of the data-chasing on behalf of scores in reading and math, occasionally science, perhaps social studies. The truncated test-driven curriculum has been enabled by the attention given to VAM. I am grateful that the sham factors in VAM are being exposed. I work in Ohio where all teacher evaluations are a sham, especially EVAAS, our version of VAM, and the infamous SLO writing exercise for teachers of “untested subjects.”

What can work in teacher evaluation: Lessons from Boys in the Boat

Daniel James Brown’s 2013 bestseller, The Boys in the Boat, illustrates four powerful ideas that ought to inform our teacher evaluation systems.


You would think by now both educators and policy makers would have learned a lesson or two about what does not work in teacher evaluation. In recent years, both the federal government and the Gates Foundation have put results-oriented teacher evaluation at the front and center of the school reform movement. However, as Kevin Close and Audrey Amrein-Beardsley describe in their article, using students’ scores on standardized achievement tests to assess teaching effectiveness has proven to be quite problematic. Not only are the statistics behind value-added models unreliable and biased, but administrators are also “tempted” to align their assessment of teachers’ classroom practice to the VAM scores they’ve already received.

While Close and Amrein-Beardsley draw upon the arguments and decisions made in 15 lawsuits that have raised questions about the legal legitimacy of using VAM to judge teachers, their analyses comport neatly with a recent 600-page RAND report on Gates’ investments in the Intensive Partnerships (IP) for Effective Teaching. In short, RAND found “no evidence” that the use of test-based teacher evaluations led to improved student learning and retention of effective teachers at the IP sites, which included three school districts and four charter management organizations. Perhaps due to the high burdens placed on principals’ time and the incomplete (and sometimes inaccurate) data produced by the tests, the sites were not able to improve the effectiveness of their current teachers through their systems of coaching, mentoring, and professional development.

As I reflect on the analyses by Close and Amrein-Beardsley as well as RAND’s post-mortem of the IP, I cannot help but think back to another RAND report on teacher evaluation, this one published in 1984, revealing how the data were rarely used to inform school priorities as well as how “evaluation processes needed to yield descriptive information that illuminates sources of difficulty, as well as viable courses for change.”

I also have to consider recent studies from numerous scholars, such as John Papay, Matt Ronfeldt, and Alan Daly, who have shown how teacher collaboration and networking strengthen student learning. We now know much about how teachers, especially those teaching diverse, high-need students, improve their pedagogical practices and take instructional risks when they learn from teaching colleagues they trust. Unfortunately, though, as yet another recent RAND investigation found, fewer than one in three teachers in the U.S. have sufficient time to collaborate with their teaching colleagues, and 44% report that, in a typical month, they never observe another teacher’s classroom to get ideas for instruction or to offer feedback.

At the same time, Close and Amrein-Beardsley’s article also brings to my mind a decidedly non-academic publication, Daniel James Brown’s 2013 bestseller The Boys in the Boat, which recounts the stunning gold-medal performance by the University of Washington rowing crew at the 1936 Berlin Olympics. What does a poignant story about a rowing team have to do with rethinking teacher evaluation? Central to the narrative is Joe Rantz, one of the UW crew members, who overcame a hardscrabble Depression-era childhood to become a successful engineer. But the legendary come-from-behind victory in the Berlin games was not about his individual effort, or that of any of his eight teammates, but about how, over time, they became attuned to one another’s performance and well-being. In fact, Joe was at one time the “weak link in the crew” who often “struggled to master the technical side of the sport,” but he and his crewmates were “fiercely determined” to make sure none of them failed. And it wasn’t just a matter of teamwork. The Boys would not have won the gold medal without the right kind of technical and moral support from their coaches as well as the engineered precision of the boat, The Husky Clipper, which was designed to maximize the particular strengths of its crew members.

Brown’s story illustrates at least four powerful ideas that ought to inform the next generation of teacher evaluation systems:

  1. We should rethink the assumption that teacher evaluation must focus on assessing and developing the skills of individual teachers — it can be more effective to cultivate teams of teachers, bringing individuals together in ways that meet their students’ particular needs;
  2. We should redesign schools to ensure that the most accomplished practitioners — both teachers and administrators — have opportunities to share ideas and provide leadership;
  3. We should retool our teaching evaluation rubrics so that they rate teachers at the top of the scale only if they share their expertise with their colleagues; and
  4. We should repurpose the job of the school principal to encourage administrators to cultivate teacher leadership and reward them for doing so.

These recommendations are not pie-in-the-sky. They are being played out today in school systems both overseas and here in the U.S. For example, in Singapore, teacher evaluation is not a checklist but a narrative that begins with self-assessment and focuses on contributions to the holistic development of students. Master teachers are identified only when they demonstrate how they help their colleagues improve.

Closer to home, in the Pomona (Calif.) Unified School District (where my organization provides technical assistance), district administrators are shifting professional development to emphasize teacher voice and choice and offer opportunities for teachers to coteach, redesign learning environments for students, and take time to share standards-based lessons and resources with their colleagues. Early evidence has shown that this approach is improving student achievement among the district’s high-need students. And now the district and the union are beginning to reshape teacher evaluation to focus on both individual growth and teamwork that leads to more equitable student outcomes.

Over the last three decades, we have learned a great deal about teacher evaluation. Researchers have surfaced many lessons from a wide range of statistical analyses, legal decisions, and qualitative studies. But the future of teacher evaluation may be best informed by how nine boys in a boat developed and shared responsibility among themselves — with supportive coaches and a well-designed shell to match their strengths — and became the #1 crew in the world.


Daly, A., Moolenaar, N., Der-Martirosian, C., & Liou, Y. (2014). Accessing capital resources: Investigating the effects of teacher human and social capital on student achievement. Teachers College Record, 116 (7), 1-42.

Johnston, W.R. & Tsai, T. (2018). The prevalence of collaboration among American teachers: National findings from the American Teacher Panel. Santa Monica, CA: RAND Corporation.

Papay, J., Taylor, E.S., Tyler, J., & Laski, M. (2016, February). Learning job skills from colleagues at work: Evidence from a field experiment using teacher performance data. Cambridge, MA: National Bureau of Economic Research.

Ronfeldt, M., Farmer, S.O., McQueen, K., & Grissom, J. (2015). Teacher collaboration in instructional teams and student achievement. American Educational Research Journal, 52 (3), 475-514.

Stecher, B. et al. (2018). Improving teaching effectiveness: Final report of the Intensive Partnerships for Effective Teaching through 2015–2016. Santa Monica, CA: RAND Corporation.

Wise, A., Darling-Hammond, L., Tyson-Bernstein, H., & McLaughlin, M. (1984). Teacher evaluation: A study of effective practices. Washington DC: RAND Corporation.

BARNETT BERRY (bberry@teachingquality.org) is the founder of the Center for Teaching Quality, a national nonprofit dedicated to igniting change inside of public education in order to transform student learning.