Grading the Teachers

This recent opinion piece from The New York Times explores the evaluation (or rather, an instance of miscalculation) of teachers based on a complex formula.

According to her principal and her students, Ms. Isaacson is a “wonderful” and “terrific” middle school teacher who makes the material “much easier to learn.” Yet when her students’ performance statistics were plugged into a formula meant to weed out “bad teachers,” Ms. Isaacson came out in the 7th percentile. This result seems incredibly low for a universally popular teacher, and with her tenure at stake, we might ask whether a mistake has been made: Was there a math error in the formula, or were her principal and students somehow mistaken or biased?

Neither seems to be the case. We don’t have any particularly good reason to think that the arithmetic of the formula was incorrect, nor should we doubt the sincere reports of her principal and her students. It seems to be a genuine case of disagreement between the more objective, numbers-based assessment and the more subjective evaluations.

In this situation, it seems that we should trust the overwhelmingly positive feedback of the principal and the students. This is supported by the fact that her students earn high raw scores on standardized tests, which leads me to think that the formula somehow undervalues her contributions. I can’t claim to understand how the formula works, though I suspect that it measures “improvement” rather than raw scores, which makes it difficult for Ms. Isaacson to score high because most of her students are already at a high level.
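To make that suspicion concrete, here is a rough sketch of how a measure based purely on improvement could rank a class of high scorers below a class of lower scorers. Everything in it is an assumption for illustration: the scores are invented, and the `average_gain` function is a hypothetical stand-in, not the actual formula used to rate Ms. Isaacson.

```python
# Hypothetical illustration only: the scores and the gain calculation below
# are assumptions, not the real formula discussed in the article.

def average_gain(before, after):
    """Mean point improvement per student between two test sittings."""
    return sum(a - b for b, a in zip(before, after)) / len(before)

def mean(scores):
    """Plain arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)

# Class A starts near the top of the scale, so there is little room to improve.
class_a_before = [92, 95, 97]
class_a_after = [94, 96, 98]

# Class B starts lower and gains more points, yet still scores lower overall.
class_b_before = [55, 60, 65]
class_b_after = [65, 70, 75]

gain_a = average_gain(class_a_before, class_a_after)
gain_b = average_gain(class_b_before, class_b_after)

print(f"Class A: final average {mean(class_a_after):.1f}, average gain {gain_a:.1f}")
print(f"Class B: final average {mean(class_b_after):.1f}, average gain {gain_b:.1f}")
```

On these made-up numbers, Class A finishes with the far higher average but shows the smaller gain, so a teacher judged on gains alone would look worse with Class A. Whether the real formula behaves this way is exactly what I can’t confirm.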

What about a hypothetical situation in which a teacher scores well on the formula but receives unsatisfactory reviews from administrators and students? Would we say that the subjective reports are somehow mistaken? I don’t think that’s the case. Even though it is notoriously difficult to put a number on, say, “inspiration,” it is much harder to “fake” being inspired or to “pretend” that the teacher was duller than she really was (assuming, of course, that the respondents aren’t lying).

The further practical concern is: should the teacher in the second case become tenured? To answer this question, I think we will need to examine the deeper conflict between objective and subjective assessments. In cases of disagreement, should we rely on the hard numbers, or listen to the testimony of others?