One of the things that experimental scientists really should do is to try to replicate each other’s results to see if they are correct or not. I have begun doing that with the value-added scores awarded to teachers in New York City, and I find that I generally agree with the results obtained by Gary Rubinstein.
What I did was to look at the value-added scores, in percentiles, that were “awarded” to thousands of New York City public school teachers in school years 05-06, 06-07, and 07-08. I found that there is essentially no correlation between the scores of the exact same teacher from year to year. The r-squared values are on the order of 0.08 to 0.09 – about as close to random as you can ever get in real life.
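(For anyone who wants to replicate this themselves, here is a rough sketch of the calculation in Python rather than Excel. The file name and column names – “nyc_value_added.csv”, “teacher_id”, “school_year”, “percentile” – are made up for illustration; you would substitute whatever the released data actually uses.)

```python
# Rough sketch, not the actual analysis: assumes a hypothetical CSV
# "nyc_value_added.csv" with columns "teacher_id", "school_year", "percentile".
import pandas as pd
from scipy import stats

df = pd.read_csv("nyc_value_added.csv")

# One row per teacher, one column of percentile scores per school year.
wide = df.pivot(index="teacher_id", columns="school_year", values="percentile")

# Keep only teachers who have a score in both years being compared.
pair = wide[["2005-06", "2006-07"]].dropna()

# Fit a regression line of the later year's percentile on the earlier year's.
fit = stats.linregress(pair["2005-06"], pair["2006-07"])
print(f"r = {fit.rvalue:.3f}, r-squared = {fit.rvalue ** 2:.4f}")
```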
Here are my two graphs for the night:
I actually had Excel draw the regression line, but it’s a joke: an r-squared value of 0.0877 means, as I said, that there is extremely little correlation between what any teacher got in school year 05-06 and what they got in SY 06-07. In the same school. With very similar kids. Teaching the same subject.
And, a similar graph comparing teachers’ scores for school year 06-07 with their scores for 07-08:
So, one year, a teacher might be around the 90th percentile. The next year, she might be around the 10th percentile. Or the other way around. Did the teacher suddenly get stupendously better (or worse)? I doubt it. By the time they are adults, most people are pretty consistent. But not according to this graph. In fact, if somebody was in the 90th-to-100th-percentile bracket in school year 06-07, the probability that they remained in that same bracket in 07-08 is roughly 1 in 4. If they were in the 0th-to-10th-percentile bracket in 06-07, the chances that they remained in the same bracket the following year are about 7%!!
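(Here is the same kind of back-of-the-envelope check for the decile brackets, again as a rough Python sketch that reuses the hypothetical “wide” table from the snippet above; the exact numbers you get will of course depend on the actual released data.)

```python
# Rough sketch of the decile-bracket check, reusing the hypothetical "wide"
# table from the earlier snippet (column names are made up for illustration).
pair = wide[["2006-07", "2007-08"]].dropna()

def stay_rate(low, high):
    """Fraction of teachers in the [low, high] percentile bracket in 2006-07
    who land in the same bracket again in 2007-08."""
    in_bracket = pair[pair["2006-07"].between(low, high)]
    if len(in_bracket) == 0:
        return float("nan")
    stayed = in_bracket[in_bracket["2007-08"].between(low, high)]
    return len(stayed) / len(in_bracket)

print(f"Top decile (90th-100th) persistence:   {stay_rate(90, 100):.0%}")
print(f"Bottom decile (0th-10th) persistence:  {stay_rate(0, 10):.0%}")
```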
What this shows is that using value-added scores to determine if someone should keep their job or get a bonus or a demotion is absolutely insane.