This comment was posted yesterday:
I am a former part-time item writer for a private testing company; I wrote for many different state standards under NCLB. I must say that poorly constructed, confusing, or developmentally inappropriate items undermine the validity of standardized scores and their subsequent use in teacher evaluation. When standardized tests are properly constructed, any such items that make it to a field test will almost certainly be weeded out during what is typically a two-year vetting process. Many items on the Pearson math and ELA tests administered last April here in NY were written, in my opinion, in an intentionally confusing style using obtuse or arcane vocabulary. The ELA test in particular included confusing item stems and distractors that were not clearly wrong. There were far too many items that turned subjective opinions (most likely; best; author’s intent; etc.) into a “one right, three wrong” format. Many teachers were unsure of the correct answers on a number of vague and fuzzy items.
After reviewing the changes in math and reading scores at all DC public schools for 2006 through 2009, I have come to the conclusion that the year-to-year school-wide changes in those scores are essentially random. That is to say, any growth (or slippage) from one year to the next is not very likely to be repeated the next year.
Actually, it’s even worse than that. The record shows that any change from year 1 to year 2 is somewhat NEGATIVELY correlated with the change between year 2 and year 3. That is, if there is growth from year 1 to year 2, then it is a bit more likely than not that there will be shrinkage between year 2 and year 3. Or, if the scores got worse from year 1 to year 2, then there is a slightly better-than-even chance that the scores will improve the following year.
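A minimal sketch of why purely random year-to-year fluctuation would produce this kind of pattern, assuming each school has a fixed underlying level plus independent noise each year. The school count, score range, and noise size are made-up values, not DC data, and this is an illustration rather than the analysis behind the graphs.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_changes(n_schools=20000, noise_sd=5.0, seed=0):
    """Simulate three years of scores that differ only by random noise."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_schools):
        true_level = rng.uniform(20, 80)  # hypothetical percent proficient
        y1, y2, y3 = (true_level + rng.gauss(0, noise_sd) for _ in range(3))
        first.append(y2 - y1)   # change from year 1 to year 2
        second.append(y3 - y2)  # change from year 2 to year 3
    return first, second

first_change, second_change = simulate_changes()
r = pearson_r(first_change, second_change)
print(round(r, 2))  # close to -0.5 under these assumptions
```

Under these assumptions the year 2 score appears in both changes with opposite signs, so a lucky year 2 inflates the first change and deflates the second. That alone pushes the correlation toward -0.5, with no real change at any school.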
And it doesn’t seem to matter whether the same principal is kept during all three years, or whether the principals are replaced one or more times over the three-year period.
In other words, all this shuffling of principals (and teachers) and turning the entire school year into preparation for the DC-CAS seems to be futile. EVEN IF YOU BELIEVE THAT THE SOLE PURPOSE OF EDUCATION IS TO PRODUCE HIGH STANDARDIZED TEST SCORES. (Which I don’t.)
Don’t believe me? I have prepared some scatterplots, below, and you can see the raw data here as a Google Doc.
My first graph is a scatterplot relating the changes in percentages of students scoring ‘proficient’ or better on the reading tests from Spring 2006 to Spring 2007 on the x-axis, with changes in percentages of students scoring ‘proficient’ or better in reading from ’07 to ’08 on the y-axis, at DC Public Schools that kept the same principals for 2005 through 2008.
If there were a positive correlation between the two time intervals in question, then the scores would cluster mostly in the first and third quadrants. And that would mean that if scores grew from ’06 to ’07 then they also grew from ’07 to ’08; or if they went down from ’06 to ’07, then they also declined from ’07 to ’08.
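The quadrant bookkeeping described above can be sketched as follows. The list of changes is hypothetical, made up only to show the counting; the real figures are in the spreadsheet.

```python
from collections import Counter

def quadrant(dx, dy):
    """Return 1-4 for the quadrant of (dx, dy), or 0 if the point lies on an axis."""
    if dx == 0 or dy == 0:
        return 0
    if dx > 0:
        return 1 if dy > 0 else 4
    return 2 if dy > 0 else 3

# Hypothetical (first-interval change, second-interval change) pairs,
# in percentage points of students scoring proficient or better
changes = [(4.0, 2.5), (-3.0, 6.0), (5.5, -4.0), (-2.0, -1.0), (7.0, -3.5)]

counts = Counter(quadrant(dx, dy) for dx, dy in changes)
print(counts[1], counts[2], counts[3], counts[4])  # 1 1 1 2
```

A positive correlation would show up as most of the count landing in quadrants 1 and 3; a negative one as most of it landing in quadrants 2 and 4.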
But that’s not what happened. In fact, in the 3rd quadrant, I only see one school – apparently M.C. Terrell – where the scores went down during both intervals. However, there are about as many schools in the second quadrant as in the first quadrant. Being in the second quadrant means that the scores declined from ’06 to ’07 but then rose from ’07 to ’08. And there appear to be about 7 schools in the fourth quadrant. Those are schools where the scores rose from ’06 to ’07 but then declined from ’07 to ’08.
I asked Excel to calculate a regression line of best fit between the two sets of data, and it produced the line that you see, slanted downwards to the right. Notice that R-squared is 0.1998, which is rather weak. If we look at R, the correlation coefficient, its size is the square root of R-squared and its sign matches the slope of the line; my calculator gives me -0.447. That means, again, that the growth (or decline) from ’06 to ’07 is negatively correlated with the growth (or decline) from ’07 to ’08 – but not in a strong manner.
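As a small sketch of that arithmetic: for a straight-line fit with one predictor, R has the size of the square root of R-squared and the sign of the slope. The function name and the placeholder slope below are assumptions for illustration; only the R-squared value comes from the chart.

```python
import math

def r_from_r_squared(r_squared, slope):
    """Correlation coefficient for a simple linear fit: sqrt(R^2) with the slope's sign."""
    return math.copysign(math.sqrt(r_squared), slope)

# R-squared as reported by Excel; the slope value is a placeholder,
# since only its sign (the line slants downward) is used here
r = r_from_r_squared(0.1998, slope=-1.0)
print(round(r, 3))  # -0.447
```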
OK. Well, how about during years ’07-’08-’09? Maybe Michelle Rhee was better at picking winners and losers than former Superintendent Janey? Let’s take a look at schools where she allowed the same principal to stay in place for ’07, ’08, and ’09:
Actually, this graph looks worse! There are nearly twice as many schools in quadrant four as in quadrant one! That means that there are lots of schools where reading scores went up between ’07 and ’08, but DECLINED from ’08 to ’09; but many fewer schools where the scores went up both years. In the second quadrant, I see about four schools where the scores declined from ’07 to ’08 but then went up between ’08 and ’09. Excel again provided a linear regression line of best fit, and again, the line slants down and to the right. R-squared is 0.1575, which is low. R itself is about -0.397, which is, again, rather low.
OK, what about schools where a principal got replaced? If you believe that all veteran administrators are bad and need to be replaced with new ones with limited or no experience, you might expect to see negative correlations, but with positive overall outcomes; in other words, the scores should cluster in the second quadrant. Let’s see if that’s true. First, reading changes over the period 2006-’07-’08:
Although there are schools in the second quadrant, there are also a lot in the first quadrant, and I also see more schools in quadrants 3 and 4 than we’ve seen in the first two graphs. According to Excel, R-squared is extremely low: 0.0504, which means that R is about -0.224, which means, essentially, that it is almost impossible to predict what the changes would be from one year to the next.
Well, how about the period ’07-’08-’09? Maybe Rhee did a better job of changing principals then? Let’s see:
Nope. Once again, it looks like there are as many schools in quadrant 4 as in quadrant 1, and considerably fewer in quadrant 2. (To refresh your memory: if a school is in quadrant 2, then the scores went down from ’07 to ’08, but increased from ’08 to ’09. That would represent a successful ‘bet’ by the superintendent or chancellor. However, if a school is in quadrant 4, that means that reading scores went up from ’07 to ’08, but went DOWN from ’08 to ’09; that would represent a losing ‘bet’ by the person in charge.) Once again, the line of regression slants down and to the right. The value of R-squared, 0.3115, is higher than in any previous scatterplot (I get R = -0.558) which is not a good sign if you believe that superintendents and chancellors can read the future.
Perhaps things are more predictable with mathematics scores? Let’s take a look. First, changes in math scores during ’06-’07-’08 at schools that kept the same principal all 3 years:
Notice that every single one of these graphs presented a weak negative correlation, with plenty of what I am calling “losing bets” – by which I mean cases where the scores went up from the first year to the second, but then went down from the second year to the third.
OK. Perhaps it’s not enough to change principals once every 3 or 4 years. Perhaps it’s best to do it every year or two? (Anybody who has actually been in a school knows that when the principal gets replaced frequently, then it’s generally a very bad sign. But let’s leave common sense aside for a moment.) Here we have scatterplots showing what the situation was, in reading and math, from ’07 through ’09, at schools that had 2 or more principal changes from ’06 to ’09:
This conclusion is not going to win me lots of friends among those who want to use “data-based” methods of deciding whether teachers or administrators keep their jobs, or how much they get paid. But facts are facts.
A little bit of mathematical background on statistics:
Statisticians say that two quantities (let’s call them A and B) are positively correlated when an increase in one quantity (A) is linked to an increase in the other quantity (B). An example might be a person’s height (for quantity A) and the length of that person’s foot (for quantity B). Generally, the taller you are, the longer your feet are. Yes, there are exceptions, so these two things don’t have a perfect correlation, but the connection is pretty strong.
If two things are negatively correlated, that means that when one quantity (A) increases, then the other quantity (B) decreases. An example would be the speed of a runner versus the time it takes to run a given distance. The higher the speed at which the athlete runs, the less time it takes to finish the race. And if you run at a lower speed, then it takes you more time to finish.
And, of course, there are things that have no correlation to speak of.
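Here is a minimal sketch of how the correlation coefficient R separates these three cases. All of the numbers are invented for illustration (they are not measurements), and the formula is the standard Pearson coefficient, which may differ in minor details from what a given spreadsheet reports.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Made-up heights (cm) and foot lengths (cm): taller goes with longer feet
heights = [150, 160, 170, 180, 190]
foot_lengths = [22.5, 24.0, 25.0, 26.5, 28.0]
r_height_foot = pearson_r(heights, foot_lengths)

# Made-up running speeds (m/s) and the time (s) each needs to cover 100 m
speeds = [4, 5, 6, 7, 8]
times = [100 / s for s in speeds]
r_speed_time = pearson_r(speeds, times)

# Two made-up lists with no linear relationship at all
a = [1, 2, 3, 4, 5]
b = [2, 1, 4, 1, 2]
r_unrelated = pearson_r(a, b)

print(r_height_foot > 0, r_speed_time < 0, r_unrelated == 0)  # True True True
```

R always falls between -1 and 1. Values near either end mean that one quantity predicts the other well, and values near 0 mean that it hardly predicts it at all.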
Today I read an editorial in the Wall Street Journal that said Obama and Duncan aren’t giving Michelle Rhee enough support. That was bad enough, but then I read the comments from the readers. They were downright scary; they sounded like Mussolini-style fascists. Here are some comments I will try to post:
I find it most ironic that a number of readers of the Wall Street Journal are calling public school teachers “selfish” because teachers are overwhelmingly against Rhee-type diktats, which mostly involve losing any form of due process for teachers, and because teachers are not all that interested in potentially earning 6-figure salaries. If teachers were really interested in making lots of money, they would go work on Wall Street, or else become right-wing educational experts like Michelle Rhee. (Duh!)
It might sound old-fashioned, but most teachers would much rather have the satisfaction of knowing that they had actually taught their students well. They would also prefer to hear kind words from their ex-students (and the parents and administrators) for their efforts, rather than spending their time on mindless test-prep activities and on competing with each other for gimmicks that might raise test scores to earn big bonuses. They also believe in fairness and telling the truth, neither of which is a virtue exhibited by the current DCPS schools chief.
The public at large has vivid, recent memories of what a single-minded concentration on a single bottom-line number – like a standardized test score – might lead to. Wall Street traders, large banks, mortgage bankers, Enron and all the rest cheated and lied to produce a good, but fictional, bottom line, so they could get those big bonuses. And they nearly crashed the whole world economy along the way. Unfortunately, some school administrators and teachers have already been caught cheating on student test scores; others have probably gotten away with it. This is not the purpose for which most teachers went into the profession.
Let me point out a few other things:
- To improve education in a school system, you need good teachers, and you need a good curriculum. Michelle Rhee has been in power for 2.5 years now, and hasn’t done boo about the curriculum – which needs a lot of work.
- Rhee has fired a lot of principals and teachers, which might make most readers of the Wall Street Journal happy, but anybody who claims that the schools where she replaced principals are doing better is lying. See earlier blog entries for statistical details on this.
- Test scores in DC public schools, both on the DC-CAS and the NAEP, have been rising steadily for many years. In the case of the NAEP, it’s been since the mid-1990s. Rhee had no hand in any of that improvement, but she’s been trying to take the credit. Again, my blog has details.
- Charter schools do absolutely no better than regular public schools, though in DC they get much more money per pupil and are helping to re-segregate our school system. CHARTER SCHOOLS ARE A FAILED EXPERIMENT! GIVE IT UP! FIX THE PUBLIC SCHOOLS!
- One very troubling new symptom of Rhee’s tenure in DCPS over the past TWO years is that the gaps between kids at the 10th and the 90th percentiles, or between the 25th and 75th percentiles, or between the poor and the non-poor, or between white students and black students, have suddenly started to grow a lot, as shown on the NAEP. I’m not sure exactly why this is or how it’s happened, but I do know that it’s not happening in the nation as a whole or in any other big city. The only major change is Rhee and her single-minded insistence on control without accountability, elimination of due process for teachers, and substituting test prep for teaching. See Bill Turque’s excellent article in the Washington Post on 12/13/09.
- If Mr. Obama continues praising Michelle Rhee in particular, and charter schools in general, it will be an extremely sad event for public education in America. I hope he comes to his senses, and replaces Arne Duncan, too.