‘Beatings Must Continue Until Morale Improves’

The ‘Value-Added Measurement’ movement in American education, implemented in part by the now-disgraced Michelle Rhee here in DC, has been a complete and utter failure, even as measured by its own yardsticks, as you will see below.

Yet, the same corporate ‘reformers’ who were its major cheerleaders do not conclude from this that the idea was a bad one. Instead, they claim that it wasn’t tried with enough rigor and fidelity.

From “Schools Matter“:


How to Learn Nothing from the Failure of VAM-Based Teacher Evaluation

The Annenberg Institute for School Reform is a most exclusive academic club lavishly funded and outfitted at Brown U. for the advancement of corporate education in America. 

The Institute is headed by Susanna Loeb, who has a whole slew of degrees from prestigious universities, none of which has anything to do with the science and art of schooling, teaching, or learning.  

Researchers at the Institute are circulating a working paper that, at first glance, would suggest that school reformers might have learned something about the failure of teacher evaluation based on value-added models applied to student test scores. The abstract:

Starting in 2009, the U.S. public education system undertook a massive effort to institute new high-stakes teacher evaluation systems. We examine the effects of these reforms on student achievement and attainment at a national scale by exploiting the staggered timing of implementation across states. We find precisely estimated null effects, on average, that rule out impacts as small as 1.5 percent of a standard deviation for achievement and 1 percentage point for high school graduation and college enrollment. We also find little evidence of heterogeneous effects across an index measuring system design rigor, specific design features, and district characteristics. [my emphasis – GFB]

So could this mean that the national failure of VAM applied to teacher evaluation might translate to decreasing the brutalization of teachers and the waste of student learning time that resulted from the implementation of VAM beginning in 2009?

No such luck.   

The conclusion of the paper, in fact, clearly shows that the Annenbergers have concluded that the failure to raise test scores by corporate accountability means (VAM) resulted from laggard states and districts that did not adhere strictly to the VAM’s mad methods.  In short, the corporate-led failure of VAM in education happened as a result of schools not being corporate enough:

Firms in the private sector often fail to implement best management practices and performance evaluation systems because of imperfectly competitive markets and the costs of implementing such policies and practices (Bloom and Van Reenen 2007). These same factors are likely to have influenced the design and implementation of teacher evaluation reforms. Unlike firms in a perfectly competitive market with incentives to implement management and evaluation systems that increase productivity, school districts and states face less competitive pressure to innovate. Similarly, adopting evaluation systems like the one implemented in Washington D.C. requires a significant investment of time, money, and political capital. Many states may have believed that the costs of these investments outweighed the benefits. Consequently, the evaluation systems adopted by many states were not meaningfully different from the status quo and subsequently failed to improve student outcomes.

So the Gates-Duncan RTTT corporate plan for teacher evaluation failed not because it was a corporate model but because it was not corporate enough!  In short, there were way too many small carrots and not enough big sticks.

More Cold Water Thrown On “Value-Added” Modeling

A number of academic researchers have shown that ‘value-added’ methodologies are nearly useless in providing any information that principals, teachers, and other policy makers can actually use to make any decisions whatsoever.  One such researcher is Sean Corcoran. I reprint here a press release from two years ago, sponsored by the Annenberg Institute (link to original document).

Unfortunately, the links in the press release don’t seem to work. Here is a link to the full report by Dr. Corcoran.

Press Release

September 16, 2010


Sean Patrick Corcoran
New York University
(212) 992-9468
Warren Simmons
Annenberg Institute for School Reform
(410) 863-7675

NEW YORK — Value-added assessments of teacher effectiveness are a “crude indicator” of the contribution that teachers make to their students’ academic outcomes, asserts Sean P. Corcoran, assistant professor of educational economics at New York University’s Steinhardt School of Culture, Education and Human Development, and research fellow at the Institute for Education and Social Policy, in a paper issued today as part of the “Education Policy for Action” series of research and policy analyses by scholars convened by the Annenberg Institute for School Reform at Brown University.

“The promise that value-added systems can provide a precise, meaningful and comprehensive picture is much overblown,” argues Corcoran, whose research report is entitled Can Teachers be Evaluated by Their Students’ Test Scores? Should They Be? The Use of Value-Added Measures of Teacher Effectiveness in Policy and Practice. “Teachers, policy-makers and school leaders should not be seduced by the elegant simplicity of value-added measures. Policy-makers, in particular, should be fully aware of their limitations and consider whether their minimal benefits outweigh their cost.”

> To view the entire report visit

> The “Education Policy for Action” series is funded by a grant from the Robert Sterling Clark Foundation.

Value-added models — the centerpiece of a national movement to evaluate, promote, compensate and dismiss teachers based in part on their students’ test scores — have proponents throughout the country, including school systems in New York City, Chicago, Houston and Washington, D.C. In theory, a teacher’s “value-added” is the unique contribution he or she makes to students’ achievement that cannot be attributed to any other current or past student, family, teacher, school, peer or community influence. In practice, states Corcoran, it is exceptionally difficult to isolate a teacher’s unique effect on academic achievement.

“The successful use of value-added requires a high level of confidence in the attribution of achievement gains to specific teachers,” he says. “Given one year of test scores, it’s impossible to distinguish between the teacher’s effect and other classroom-specific factors. Over many years, the effects of other factors average out, making it easier to infer a teacher’s impact. But this is little comfort to a teacher or school leader searching for actionable information today.”
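Corcoran’s point about noise averaging out over many years can be illustrated with a toy simulation. The numbers below (a true teacher effect of 0.10 SD and classroom-level noise of 0.20 SD) are invented for illustration and are not drawn from any of the studies discussed here:

```python
import random

random.seed(0)

TRUE_EFFECT = 0.10      # hypothetical teacher effect, in test-score SD units
CLASS_NOISE_SD = 0.20   # hypothetical classroom-level noise (peers, test form, etc.)

def observed_gain(years):
    """Average measured gain over `years` classrooms for one teacher."""
    gains = [TRUE_EFFECT + random.gauss(0, CLASS_NOISE_SD) for _ in range(years)]
    return sum(gains) / years

# With one year of data, a single noisy draw can dwarf or even reverse the
# true effect; averaging more years shrinks the spread of the estimates.
for years in (1, 5, 25):
    estimates = [observed_gain(years) for _ in range(10_000)]
    mean = sum(estimates) / len(estimates)
    sd = (sum((e - mean) ** 2 for e in estimates) / len(estimates)) ** 0.5
    print(f"{years:>2} year(s): spread of estimates ~ {sd:.3f} SD units")
```

Under these assumptions, the one-year spread is twice the size of the supposed teacher effect itself, which is exactly why a single year of scores tells a principal almost nothing.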

In October 2009, the National Academies’ National Research Council issued a statement that applauded the Department of Education’s proposed use of assessment systems that link student achievement to teachers in Race to the Top initiatives, but cautioned against the use of value-added approaches for evaluation purposes, citing that “too little research has been done on these methods’ validity to base high-stakes decisions about teachers on them.”

Corcoran’s research examines the value-added systems used in New York City’s Teacher Data Reports and Houston’s ASPIRE program (Accelerating Student Progress, Increasing Results and Expectations). Among his concerns is that the standardized tests used to support these systems are inappropriate for value-added measurement.

“Value-added assessment works best when students are able to receive a single numeric test score every year on a continuous developmental scale,” states Corcoran, meaning that the scale does not depend on grade-specific content but rather progresses across grade levels. Neither the Texas nor New York state test was designed on such a scale. Moreover, the set of skills and subjects that can be adequately assessed in this way is remarkably small, he argues, suggesting that value-added systems will ignore much of the work teachers do.

“Not all subjects are or can be tested, and even within tested subject areas, only certain skills readily conform to standardized testing,” he says. “Despite that, value-added measures depend exclusively on such tests. State tests are often predictable in both content and format, and value-added rankings will tend to reward those who take the time to master the predictability of the test.”

In practice, the biggest obstacle to value-added assessments is their high level of imprecision, he argues.

“A teacher ranked in the 43rd percentile on New York City’s Teacher Data Report may have a range of possible rankings from the 15th to the 71st percentile after taking statistical uncertainty into account,” says Corcoran. He finds that the majority of teachers in New York City’s Teacher Data Reports cannot be statistically distinguished from 60 percent or more of the other teachers in the district.
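The percentile-band example can be sketched with a back-of-the-envelope calculation, assuming true teacher effects are normally distributed. The spread (0.10 SD) and standard error (0.05 SD) below are hypothetical, chosen only for illustration; they are not the Teacher Data Reports’ actual parameters:

```python
from statistics import NormalDist

# Hypothetical parameters (chosen for illustration only):
TEACHER_SD = 0.10  # assumed spread of true teacher effects, in test-score SD units
EST_SE = 0.05      # assumed standard error of a single teacher's estimate

effects = NormalDist(0, TEACHER_SD)  # assumed distribution of true teacher effects

# A point estimate at the 43rd percentile of the teacher distribution...
point = effects.inv_cdf(0.43)

# ...carries a 95% confidence band on the underlying value-added score...
lo = point - 1.96 * EST_SE
hi = point + 1.96 * EST_SE

# ...which, mapped back to percentiles, spans a huge range of rankings.
print("point estimate: 43rd percentile")
print(f"95% band: roughly the {effects.cdf(lo) * 100:.0f}th "
      f"to {effects.cdf(hi) * 100:.0f}th percentile")
```

Even with these fairly generous assumptions, the confidence band covers most of the distribution, which is the imprecision Corcoran describes.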

“With this level of uncertainty, one cannot differentiate between below average, average, and above average teachers with confidence. At the end of the day, it isn’t clear what teachers and their principals are supposed to do with this information.”

Corcoran grants that some uncertainty is inevitable in value-added measurement but questions whether value-added measures are precise enough to be useful in high-stakes decision-making or even for professional development. Novice teachers have the most to gain from performance feedback, he contends, yet value-added scores for these teachers are the least reliable.

The notion that a statistical model could isolate each teacher’s unique contribution to their students’ educational outcomes is a powerful one, acknowledges Corcoran. With such information, one could not only devise systems that reward teachers with demonstrated records of classroom success and remove teachers who do not, but also create a school climate in which teachers and principals work constructively with their test results to make positive instructional and organizational changes.

“Few can deny the intuitive appeal of these tools,” says Corcoran. “Teacher quality is an immensely important resource, and research has found that teachers can and do vary in their effectiveness. However, these evaluation tools have limitations and shortcomings that are not understood or apparent to interested stakeholders, or even to value-added advocates.”

Adds Corcoran: “Research on value-added remains in its infancy, and it is likely that these methods — and the tests on which they are based — will continue to improve over time. The simple fact that teachers and principals are receiving regular and timely feedback on their students’ achievement is an accomplishment in and of itself. It’s hard to argue that stimulating conversation around improving student achievement is not a positive thing, but teachers, policy-makers, and school leaders should not be seduced by the simplicity of value added.”

# # #

© Annenberg Institute for School Reform

Published on March 21, 2012 at 12:38 pm