I haven’t read this yet, but it looks useful:
Teacher VAM scores aren’t stable over time in Florida, either
I quote from a paper studying whether value-added scores for the same teachers tend to be consistent from one year to the next. In other words, does VAM give us any real chance of picking out the crappy teachers and giving bonuses to the good ones?
The answer, in complicated language, is essentially NO, but here is how they word it:
“Recently, a number of school districts have begun using measures of teachers’ contributions to student test scores or teacher “value added” to determine salaries and other monetary rewards.
In this paper we investigate the precision of value-added measures by analyzing their intertemporal stability.
We find that these measures of teacher productivity are only moderately stable over time, with year-to-year correlations in the range of 0.2–0.3.”
Or, in plain English: if you know anything at all about scatter plots and linear correlation, those scores wander all over the place and should never be used as serious evidence of anything. Speculation, perhaps, but not policy, and not hiring or firing decisions of any sort.
They do say that they have some statistical tricks that allow them to make the correlation look better, but I don’t trust that sort of thing. It’s not real.
Here’s a table from the paper. Look at those R values, and note that if you square those correlation coefficients (go ahead, use the calculator on your cell phone) you get numbers that are way, way smaller – like what Gary Rubenstein and I reported concerning DCPS and NYCPS.
For your convenience, I circled the highest R value, 0.61, in middle schools on something called the normed FCAT-SSS, whatever that is (go ahead and look it up if it interests you), in Duval County, Florida, one of the places where they had data. I also circled the lowest R value, 0.07, in Palm Beach County, on the FCAT-NRT, whatever that is.
I couldn’t resist: 0.56^2 is about 0.31 as an r-squared, which is moderate. There is only one correlation anywhere near that high (0.56) out of the 24 such correlation calculations. The lowest value is 0.07, and if we square that and round it off we get an r-squared value of 0.005 – shockingly low, essentially no relationship at all.
The median correlation coefficient is about 0.285, which I indicated by circling two adjacent values of 0.28 and 0.29 in green. Square that value and you get r^2 = 0.08, which is nearly useless. Again.
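If you don’t have your cell phone calculator handy, here is a quick Python sketch squaring the r values mentioned above – the paper’s reported 0.2–0.3 range, the circled extremes, and the median:

```python
# Squaring a correlation coefficient (r) gives r-squared: the fraction of
# the variance in one year's score that is "predicted" by the prior year's
# score. Every r value below appears in the text or the circled table cells.
r_values = [0.07, 0.20, 0.285, 0.30, 0.56]

for r in r_values:
    print(f"r = {r:.3f}  ->  r^2 = {r * r:.3f}")
```

Even the single best value, 0.56, explains less than a third of the variance; the median value explains about 8 percent of it.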
I’m really sorry, but even though this paper was written four years ago, it’s still under wraps, or so it says?!?! I’m not supposed to quote from it? Well, to hell with that. It’s important data, for keereissake!
Perhaps the authors can forgive me; I don’t know how to contact them anyway. (Does anybody have their contact information?) Here are the title, credits, and warning:
THE INTERTEMPORAL STABILITY OF TEACHER EFFECT ESTIMATES *
by
Daniel F. McCaffrey; Tim R. Sass; J. R. Lockwood
The RAND Corporation; Florida State University; The RAND Corporation
Original Version: April 9, 2008
This Version: June 27, 2008
*This paper has not been formally reviewed and should not be cited, quoted, reproduced, or retransmitted without the authors’ permission. This material is based on work supported by a supplemental grant to the National Center for Performance Initiatives funded by the United States Department of Education, Institute of Education Sciences. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of these organizations.
More on VAM (Valueless Abracadabra Mongering)
It’s worth your while to read this article from GothamSchools, a New York education news site, concerning many of the problems attendant on, and inherent in, the so-called Value-Added Measurements.
http://gothamschools.org/2010/09/17/widemarginsoferrorinstabilityoncitysvalueaddedreports/
A few quotes from the article:
“[…] 31 percent of English teachers who ranked in the bottom quintile of teachers in 2007 had jumped to one of the top two quintiles by 2008. About 23 percent of math teachers made the same jump.
“There was an overall correlation between how a teacher scored from one year to the next, and for some teachers, the measurement was more stable. Of the math teachers who ranked in the top quintile in 2007, 40 percent retained that crown in 2008.
“The weaknesses of value-added detailed in the report include:

“the fact that value-added scores are inherently relative, grading teachers on a curve — and thereby rendering the goal of having only high value-added teachers ‘a technical impossibility,’ as Corcoran writes

“the interference of imperfect state tests, which, when swapped with other assessments, can make a teacher who had looked stellar suddenly look subpar

“and the challenge of truly eliminating the influence of everything else that happens in a school and a classroom from that ‘unique contribution’ by the teacher