Teacher Evaluation Debate Kicked Up by Gates Project Hits Colorado

If a person, asked why he is doing something, gives the response, “Because everyone else is doing it,” that usually won’t pass muster. If that person happens to be 5 years old, even an accomplished blogging prodigy, you’d cut him a little slack… right? Today, it seems like everyone out there has something to say about the Gates Foundation’s newly released findings from the Measures of Effective Teaching (MET) Project.

Ed News Colorado’s Julie Poppen highlights a study conclusion that meshes very well with Colorado’s SB 191 teacher evaluation reform:

“Our data suggest that assigning 50 percent or 33 percent of the weight to state test results maintains considerable predictive power, increases reliability and potentially avoids the unintended negative consequences from assigning too-heavy weights to a single measure,” the much-awaited MET study found.

The implementation of SB 191 by school districts looms large across Colorado’s education policy landscape. Local boards are empowered to stop the “Dance of the Lemons” with ineffective teachers, and to put in place compensation systems that truly reward performance. By and large, these represent positive developments. But does the new Gates study really validate the law’s formula that 50 percent of teacher evaluations should be tied to academic growth?

“Not so fast,” one prominent academic critic tells Wall Street Journal reporter Stephanie Banchero:

Jay P. Green [sic], a professor of education policy at the University of Arkansas, called the Gates research a “political document and not a research document.” He said the research doesn’t support that classroom observations are a strong predictor of quality teaching.

“But the Gates Foundation knows that teachers and others are resistant to a system that is based too heavily on student test scores, so they combined them with other measures to find something that was more agreeable to them,” he said.

At his blog, Dr. Greene fleshes out his objections and points to evidence from the Gates project’s own research to argue that “[c]lassroom observations make virtually no independent contribution to the predictive power of a teacher evaluation system.”

But maybe there’s more to learn about how observations are done than whether they are valid per se. Philissa Cramer at Gotham Schools notes the finding that having multiple observers produces more reliable assessments of teacher performance than having multiple observations by the same person. Like the call to use multiple measures of student academic growth, this lesson shows up in the MET Project’s guiding principles for school leaders.

One thing is for sure. Even if interpretations conflict, the scope of the project (covering evaluations of 3,000 teachers in six major districts, including Denver) yields significant data. Roughly $45 million later, we may have a clearer picture of how to gauge a teacher’s future effectiveness. Past performance still doesn’t always predict future results, but looking at the trend of how an instructor improves student test scores remains the best tool at our disposal.