Lists of DC Public Schools With Suspiciously High Wrong-to-Right Erasure Rates

Please note that it was the TESTING COMPANY ITSELF that found all of these erasures over the past three years to be suspicious. They informed the State Superintendent of Education, who has exactly zero power in DC, so she asked Rhee and Henderson and their underlings to investigate. The latter group, naturally, stonewalled, and hushed the whole thing up.

As jaded and as cynical as some of you might think I am, I had no earthly idea that this evident fraud was so widespread. Naturally, Michelle Rhee has now said that she thinks that the USA Today investigation itself is an “insult” to teachers and students. I disagree. I think that everything that Michelle Rhee has done since she quit teaching in Baltimore has been an insult to parents, teachers, students, and honest administrators.

I think that the DC Inspector General’s office needs to investigate this fully and to indict the leaders who caused this fraud to happen.

There was a time that the IG office actually did their job and investigated serious crimes and official misdemeanors in DC; but apparently those days are over. In Atlanta, the investigation was far-reaching and has revealed widespread corruption. See this, this, this, and this. The way they got the ‘goods’ on the higher-ups was the usual: low-level teachers and counselors who did the cheating under orders or threats were offered immunity in exchange for truthful testimony.

Here is a link to the USA Today article. Lots of tables!

Is your school on the list?

By the way, classes were “flagged” as suspicious if the number of wrong-to-right erasures on a test was four or more full  standard deviations above the mean. Four standard deviations (or ‘four sigma’)  is a TREMENDOUSLY HUGE increase over the normal number of such erasures. For comparison, there are typically 2 or 3 such erasures on a single student’s test.

To put it in more familiar terms, think about the average adult man’s height in the USA: about 5 feet 9.5 inches. Anybody within about 3 inches of that (taller or shorter) is within one “sigma” of the mean – and this means, statistically, that about 68% of all adult males are between 5’7″ and 6’1″ – and that includes this writer.

So, ‘one sigma’ in terms of adult US male height is about 3 inches of variation in height.

To be over four sigma higher than the mean adult height is to be a giant.  Here is a little table that will allow you to look at what it means. When you look at it, also try to think about the fact that there are just about one hundred million adult males in the US. (100,000,000) So to be one of those 3,200 people who are four sigma above the mean is to be, basically, a freak of nature. And there are only 28 people who are five sigma above the mean. And there are only TWO humans in the entire WORLD (over six BILLION people) who are 6 sigma above the mean. (Source is here.)

Notice that four sigma above the mean (i.e., 6’7″ and higher) doesn’t even show up on this graph!
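For readers who want to check the "sigma" arithmetic themselves, here is a minimal sketch. It assumes that heights follow a perfect bell curve (normal distribution), which is only an approximation, and it uses the round figure of one hundred million adult males from above. The function names are my own, made up for illustration.

```python
import math

def upper_tail(sigmas):
    """Fraction of a normal population lying MORE than `sigmas` above the mean."""
    return 0.5 * math.erfc(sigmas / math.sqrt(2))

def within(sigmas):
    """Fraction of a normal population lying within `sigmas` of the mean, either side."""
    return math.erf(sigmas / math.sqrt(2))

ADULT_MALES = 100_000_000  # round figure used in the text, not a census count

# Expected head counts above four and five sigma, under the normality assumption
for k in (4, 5):
    print(f"{k} sigma above the mean: about {upper_tail(k) * ADULT_MALES:,.0f} men")

# Share of the population within one sigma of the mean
print(f"within one sigma: {within(1) * 100:.1f}%")
```

The counts it prints land close to the 3,200 and 28 quoted above, and the one-sigma share comes out at about 68%.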

However, in DCPS, according to the testing company that scored the DC-CAS (not according to me!) we have HUNDREDS and HUNDREDS of classes where the AVERAGE number of erasures is FOUR SIGMA above the mean.

There is only one reasonable explanation for this situation.

I will soon post documents from the previous investigations, so you can look at them for yourself. Stay tuned!

Guess who’s scoring those ‘brief constructed responses’?

This was written by one of the scorers. You can find the entire article here.

(It’s called “The Loneliness of the Long-Distance Scorer.”)

“Test-scoring companies make their money by hiring a temporary workforce each spring, people willing to work for low wages (generally $11 to $13 an hour), no benefits, and no hope of long-term employment—not exactly the most attractive conditions for trained and licensed educators. So all it takes to become a test scorer is a bachelor’s degree, a lack of a steady job, and a willingness to throw independent thinking out the window and follow the absurd and ever-changing guidelines set by the test-scoring companies. Some of us scorers are retired teachers, but most are former office workers, former security guards, or former holders of any of the diverse array of jobs previously done by the currently unemployed. When I began working in test scoring three years ago, my first “team leader” was qualified to supervise, not because of his credentials in the field of education, but because he had been a low-level manager at a local Target.

“In the test-scoring centers in which I have worked, located in downtown St. Paul and a Minneapolis suburb, the workforce has been overwhelmingly white—upwards of 90 percent. Meanwhile, in many of the school districts for which these scores matter the most—where officials will determine whether schools will be shut down, or kids will be held back, or teachers fired—the vast majority are students of color. As of 2005, 80 percent of students in the nation’s twenty largest school districts were youth of color. The idea that these cultural barriers do not matter, since we are supposed to be grading all students by the same standard, seems far-fetched, to say the least. Perhaps it would be better to outsource the jobs to India, where the cultural gap might, in some ways, be smaller.

“Many test scorers have been doing this job for years—sometimes a decade or more. Yet these are the ultimate in temporary, seasonal jobs. The Human Resources people who interview and hire you are temps, as are most of the supervisors. In one test-scoring center, even the office space and computers were leased temporarily. Whenever I complained about these things, some coworker would inevitably say, “Hey, it beats working at Subway or McDonald’s.”

“True, but does it inspire confidence to know that, for the people scoring the tests at the center of this nation’s education policy, the alternative is working in fast food? Or to know that, because of our low wages and lack of benefits, many test scorers have to work two jobs—delivering newspapers in the morning, hustling off to cashier or waitress at night, or, if you’re me (and plenty of others like me) heading home to start a second shift of test scoring for another company?

“Company communications with test-scoring employees often feel like they have been lifted from a Kafka novel. Scorers working from home almost never talk to an actual human being. Pearson sends all its communications to home scorers via e-mail, now supplemented by automated phone calls telling you to check your inbox. After the start of a project, even these e-mails cease, and scorers are forced to check the project homepage on their own initiative to find out any important changes. Remarkably, for a company entrusted with assessing students’ educational performance, messages from Pearson contain a disturbing number of misspellings, incorrect dates, typos, and missing information. Pearson’s online video orientation, for example, warns scorers that they may face “civil lawshits” from sexual harassment. Error-free communications are rare. I was considering whether this was a fair assessment, when I received a message from Pearson with the subject “Pearson Fall 2010.” The link in the e-mail took me to a survey to find out my availability—for the spring of 2011.”

A rant concerning education

There is fraud in many, many realms of work and human enterprise. Including lawyers, doctors, businessmen, accountants, engineers, policemen, nurses, painters, taxi drivers, politicians, ‘reformers’, housewives, babies, children, students, the retired, stockholders, hunter-gatherers, soldiers, officers, spies, writers like me… (Sorry if I left out your favorite group; I got tired of typing this list) We are all sometimes crooked, no? Including some teachers.

But I think the problem is deeper. Yes, there is an awful lot of corruption and outright graft in education (as there is in many other areas). But I think that education and upbringing of the next generation is one of the most important things we can do. The last thing we really want is to have gangs of unemployed, disengaged kids hanging on street-corners, engaging in thuggish and criminal behavior, getting locked up for various offenses, engaging in violence and so on … regardless of whether their freaking math and reading test scores were ‘proficient’, ‘advanced’, ‘basic’, or ‘below basic’ – that’s not really important. What’s important is, are they becoming good human beings, or otherwise? And is it the sole job of the classroom teacher to fix all that? I don’t think he or she could if they tried. And, lord knows, they have been trying. And in the past 10 years they have been forced to work harder and harder, to no real human avail nor real improvement.

One could easily make the argument that we don’t spend nearly enough money on education. Heck, every single student should begin learning a foreign language soon after they learn to write their own. Plus, they should get really good coaching in some sort of physical endeavor (not necessarily a sport). Plus, they should all learn to play a musical instrument and to cook good food. And to appreciate good literature, music, and other cultures. And learn how to use various tools (metal, wood, software, and much, much more).

And to learn how society actually does function, and how it SHOULD work, why it works the way it does instead of the way it should, and to try to figure out ways to get from the actual present situation to an improved situation.

We are doing very little of any of this with our most underprivileged young society members. The kids who are raised in our ghettoes very seldom get to learn any of that stuff. Instead, society waits until they do something really, really wrong, and then locks them up. But it’s really, really expensive to keep someone locked up for 30 or 40 years – at about $20,000 per prisoner per year, that’s six hundred thousand to eight hundred thousand dollars ($600,000 to $800,000) per prisoner. It would have been a lot cheaper in the long run to invest in after-school programs to seriously engage students in sports, music, and much, much more, including lots of field trips to museums, zoos, mountains, beaches, factories, farms, and much, much more.

Instead, we are narrowing our educational goals more and more onto things that really don’t matter very much at all. (Have you actually LOOKED at the inane questions they ask on these dinky standardized NCLB tests? They were written by people who have absolutely no experience in the real world, or chose to ignore everything they ever learned about it.)

Whether DC-CAS scores go up or down at any school seems mostly to be random!

After reviewing the changes in math and reading scores at all DC public schools for 2006 through 2009, I have come to the conclusion that the year-to-year school-wide changes in those scores are essentially random. That is to say, any growth (or slippage) from one year to the next is not very likely to be repeated the next year.

Actually, it’s even worse than that. The record shows that any change from year 1 to year 2 is somewhat NEGATIVELY correlated to the changes between year 2 and year 3. That is, if there is growth from year 1 to year 2, then it is a bit more likely than not that there will be a shrinkage between year 2 and year 3. Or, if the scores got worse from year 1 to year 2, then there is a slightly better-than-even chance that the scores will improve the following year.
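A mild negative correlation like this is what pure chance tends to produce. Here is a small simulation sketch, under an assumption that is mine and not taken from the DCPS data: each imaginary school has a fixed underlying level, and each year's score is that level plus independent random noise. Nothing about the school ever changes, yet consecutive year-to-year changes come out negatively correlated (in theory, -0.5 for this simple model). All names and numbers below are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_changes(num_schools, noise_sd, seed):
    """Return (year1->year2 changes, year2->year3 changes) for imaginary schools."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(num_schools):
        level = rng.uniform(20, 80)  # fixed 'true' percent proficient (made up)
        # Three yearly scores: the same level plus independent noise each year
        y1, y2, y3 = (level + rng.gauss(0, noise_sd) for _ in range(3))
        first.append(y2 - y1)
        second.append(y3 - y2)
    return first, second

changes_1, changes_2 = simulate_changes(num_schools=20000, noise_sd=5.0, seed=1)
r = pearson_r(changes_1, changes_2)
print(f"correlation between consecutive changes: {r:.2f}")  # theory for this model: -0.5
```

The reason is that the noise in the middle year counts as a gain in the first change and as a loss in the second, so a lucky year is usually followed by an apparent drop.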

And it doesn’t seem to matter whether the same principal is kept during all three years, or whether the principals are replaced one or more times over the three-year period.

In other words, all this shuffling of principals (and teachers) and turning the entire school year into preparation for the DC-CAS seems to be futile. EVEN IF YOU BELIEVE THAT THE SOLE PURPOSE OF EDUCATION IS TO PRODUCE HIGH STANDARDIZED TEST SCORES. (Which I don’t.)

Don’t believe me? I have prepared some scatterplots, below, and you can see the raw data here as a Google Doc.

My first graph is a scatterplot relating the changes in percentages of students scoring ‘proficient’ or better on the reading tests from Spring 2006 to Spring 2007 on the x-axis, with changes in percentages of students scoring ‘proficient’ or better in reading from ’07 to ’08 on the y-axis, at DC Public Schools that kept the same principals for 2005 through 2008.

If there were a positive correlation between the two time intervals in question, then the scores would cluster mostly in the first and third quadrants. And that would mean that if scores grew from ’06 to ’07 then they also grew from ’07 to ’08; or if they went down from ’06 to ’07, then they also declined from ’07 to ’08.
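The quadrant bookkeeping can be sketched in a few lines. The pairs below are invented for illustration and are not the DCPS figures; each pair is (change in the first interval, change in the second interval), in percentage points.

```python
from collections import Counter

def quadrant(first_change, second_change):
    """Name the scatterplot quadrant for one school's pair of changes."""
    if first_change == 0 or second_change == 0:
        return "on an axis"
    if first_change > 0:
        # Rose in the first interval: Q1 if it rose again, Q4 if it then fell
        return "Q1" if second_change > 0 else "Q4"
    # Fell in the first interval: Q2 if it then rose, Q3 if it fell again
    return "Q2" if second_change > 0 else "Q3"

# Hypothetical schools, NOT real data
hypothetical = [(3.0, 2.0), (-4.0, 5.5), (-1.0, -2.0), (6.0, -3.5), (2.5, -1.0)]

counts = Counter(quadrant(a, b) for a, b in hypothetical)
for name in ("Q1", "Q2", "Q3", "Q4"):
    print(name, counts[name])
```

A positive correlation would pile the counts into Q1 and Q3; a negative one piles them into Q2 and Q4.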

But that’s not what happened. In fact, in the 3rd quadrant, I only see one school – apparently M.C. Terrell – where the scores went down during both intervals. However, there are about as many schools in the second quadrant as in the first quadrant. Being in the second quadrant means that the scores declined from ’06 to ’07 but then rose from ’07 to ’08. And there appear to be about 7 schools in the fourth quadrant. Those are schools where the scores rose from ’06 to ’07 but then declined from ’07 to ’08.

I asked Excel to calculate a regression line of best fit between the two sets of data, and it produced the line that you see, slanted downwards to the right. Notice that R-squared is 0.1998, which is rather weak. If we look at R, the correlation coefficient (the square root of R-squared, taken with a negative sign because the line slopes downward), my calculator gives me -0.447, which means again that the growth (or decline) from ’06 to ’07 is negatively correlated with the growth (or decline) from ’07 to ’08 – but not in a strong manner.
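The step from R-squared to R is easy to reproduce. A minimal sketch follows; the helper names and the five data points are hypothetical, and only the 0.1998 comes from the graph described above. For a straight-line fit, R has the same sign as the slope, so a downward-slanting line means a negative R.

```python
import math

def r_from_r_squared(r_squared, slope_is_negative):
    """Recover R from R-squared, attaching the sign of the fitted slope."""
    r = math.sqrt(r_squared)
    return -r if slope_is_negative else r

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed directly from the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# The R-squared value Excel reported for the first scatterplot
print(round(r_from_r_squared(0.1998, slope_is_negative=True), 3))  # -0.447

# Consistency check on invented data: squaring R and converting back returns R
xs = [-5, -2, 0, 3, 6]   # hypothetical first-interval changes
ys = [4, 1, 2, -3, -1]   # hypothetical second-interval changes
r_direct = pearson_r(xs, ys)
r_recovered = r_from_r_squared(r_direct ** 2, slope_is_negative=r_direct < 0)
print(abs(r_direct - r_recovered) < 1e-9)
```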

OK. Well, how about during years ’07-’08-’09? Maybe Michelle Rhee was better at picking winners and losers than former Superintendent Janey? Let’s take a look at schools where she allowed the same principal to stay in place for ’07, ’08, and ’09:

Actually, this graph looks worse! There are nearly twice as many schools in quadrant four as in quadrant one! That means that there are lots of schools where reading scores went up between ’07 and ’08, but DECLINED from ’08 to ’09; but many fewer schools where the scores went up both years. In the second quadrant, I  see about four schools where the scores declined from ’07 to ’08 but then went up between ’08 and ’09. Excel again provided a linear regression line of best fit, and again, the line slants down and to the right. R-squared is 0.1575, which is low. R itself is about -0.397, which is, again, rather low.

OK, what about schools where a principal got replaced? If you believe that all veteran administrators are bad and need to be replaced with new ones with limited or no experience, you might expect to see negative correlations, but with positive overall outcomes; in other words, the scores should cluster in the second quadrant. Let’s see if that’s true. First, reading changes over the period 2006-’07-’08:

Although there are schools in the second quadrant, there are also a lot in the first quadrant, and I also see more schools in quadrants 3 and 4 than we’ve seen in the first two graphs. According to Excel, R-squared is extremely low: 0.0504, which means that R is about -0.224, which means, essentially, that it is almost impossible to predict what the changes would be from one year to the next.

Well, how about the period ’07-’08-’09? Maybe Rhee did a better job of changing principals then? Let’s see:

Nope. Once again, it looks like there are as many schools in quadrant 4 as in quadrant 1, and considerably fewer in quadrant 2. (To refresh your memory: if a school is in quadrant 2, then the scores went down from ’07 to ’08, but increased from ’08 to ’09. That would represent a successful ‘bet’ by the superintendent or chancellor. However, if a school is in quadrant 4, that means that reading scores went up from ’07 to ’08, but went DOWN from ’08 to ’09; that would represent a losing ‘bet’ by the person in charge.) Once again, the line of regression slants down and to the right.  The value of R-squared, 0.3115, is higher than in any previous scatterplot (I get R = -0.558) which is not a good sign if you believe that superintendents and chancellors can read the future.

Perhaps things are more predictable with mathematics scores? Let’s take a look. First, changes in math scores during ’06-’07-’08 at schools that kept the same principal all 3 years:

Doesn’t look all that different from our first Reading graph, does it? Now, math score changes during ’07-’08-’09, schools with the same principal all 3 years:

Again, a weak negative correlation. OK, what about schools where the principals changed at least once? First look at ’06-’07-’08:

And how about ’07-’08-’09 for schools with at least one principal change?

Again, a very weak negative correlation, with plenty of ‘losing bets’.

Notice that every single one of these graphs presented a weak negative correlation, with plenty of what I am calling “losing bets” – by which I mean cases where the scores went up from the first year to the second, but then went down from the second year to the third.

OK. Perhaps it’s not enough to change principals once every 3 or 4 years. Perhaps it’s best to do it every year or two? (Anybody who has actually been in a school knows that when the principal gets replaced frequently, then it’s generally a very bad sign. But let’s leave common sense aside for a moment.) Here we have scatterplots showing what the situation was, in reading and math, from ’07 through ’09, at schools that had 2 or more principal changes from ’06 to ’09:


This conclusion is not going to win me lots of friends among those who want to use “data-based” methods of deciding whether teachers or administrators keep their jobs, or how much they get paid. But facts are facts.


A little bit of mathematical background on statistics:

Statisticians say that two quantities (let’s call them A and B) are positively correlated when an increase in one quantity (A) is linked to an increase in the other quantity (B). An example might be a person’s height (for quantity A) and length of a person’s foot (for quantity B). Generally, the taller you are, the longer your feet are. Yes, there are exceptions, so these two things don’t have a perfect correlation, but the connection is pretty strong.

If two things are negatively correlated, that means that when one quantity (A) increases, then the other quantity (B) decreases. An example would be the speed of a runner versus the time it takes to run a given distance.  The higher the speed at which the athlete runs, the less time it takes to finish the race. And if you run at a lower speed, then it takes you more time to finish.

And, of course, there are things that have no correlation to speak of.
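To make the two examples concrete, here is a short sketch that computes the correlation coefficient for each. The height and foot-length figures are made up for illustration, and the race is assumed to be 100 meters; only the idea (taller goes with longer feet, faster goes with less time) comes from the text.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient: +1 perfect positive, -1 perfect negative."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical people: height in cm and foot length in cm (invented numbers)
heights = [160, 165, 170, 175, 180, 185]
foot_lengths = [24.0, 24.5, 25.5, 26.0, 27.5, 27.0]
r_height_foot = pearson_r(heights, foot_lengths)

# Runners over an assumed 100 m race: time = distance / speed
speeds = [4, 5, 6, 8, 10]              # meters per second
times = [100 / s for s in speeds]      # seconds
r_speed_time = pearson_r(speeds, times)

print(f"height vs. foot length: r = {r_height_foot:.2f}")  # positive
print(f"speed vs. race time:    r = {r_speed_time:.2f}")   # negative
```

Notice that the speed-and-time figure is strongly negative but not exactly -1, even though time is completely determined by speed: the correlation coefficient measures how well a straight line fits, and time versus speed is a curve.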

Published on March 13, 2010 at 3:37 pm

Wall Street Likes Michelle Rhee. I don’t. You shouldn’t, either.

Today I read an editorial in the Wall Street Journal that said Obama and Duncan aren’t giving Michelle Rhee enough support. That was bad enough, but then I read the comments from the readers. They were downright scary; they sounded like Mussolini-style fascists. Here are some comments I will try to post:

I find it most ironic that a number of readers of the Wall Street Journal are calling public school teachers “selfish” because teachers are overwhelmingly against Rhee-type diktats, which mostly involve losing any form of due process for teachers, and  because teachers are not all that interested in potentially earning 6-figure salaries. If teachers were really interested in making lots of money, they would go work on Wall Street, or else become right-wing educational experts like Michelle Rhee. (Duh!)

It might sound old-fashioned, but most teachers would much rather have the satisfaction of knowing that they had actually taught their students well. They  would also prefer to hear kind words from their ex-students (and the parents and administrators) for their efforts, rather than spending their time on mindless test-prep activities and on competing with each other for gimmicks that might raise test scores to earn big bonuses. They also believe in fairness and telling the truth, neither of which are virtues exhibited by the current DCPS schools chief.

The public at large has vivid, recent memories of what a single-minded concentration on a single bottom-line number – like a standardized test score – might lead to. Wall Street traders, large banks, mortgage bankers, Enron and all the rest cheated and lied to produce a good, but fictional, bottom line, so they could get those big bonuses. And they nearly crashed the whole world economy along the way. Unfortunately, some school administrators and teachers have already been caught cheating on student test scores; others have probably gotten away with it. This is not the purpose for which most teachers went into the profession.

Let me point out a few other things:

  1. To improve education in a school system, you need good teachers, and you need a good curriculum. Michelle Rhee has been in power for 2.5 years now, and hasn’t done boo about the curriculum – which needs a lot of work.
  2. Rhee has fired a lot of principals and teachers, which might make most readers of the Wall Street Journal happy, but anybody who claims that the schools where she replaced principals are doing better is lying. See earlier blog entries for statistical details on this.
  3. Test scores in DC public schools, both on the DC-CAS and the NAEP, have been rising steadily for many years. In the case of the NAEP, it’s been since the mid-1990’s. Rhee had no hand in any of that improvement, but she’s been trying to take the credit. Again, my blog has details.
  4. Charter schools do absolutely no better than regular public schools, though in DC they get much more money per pupil and are helping to re-segregate our school system. CHARTER SCHOOLS ARE A FAILED EXPERIMENT! GIVE IT UP! FIX THE PUBLIC SCHOOLS!
  5. One very troubling new symptom of Rhee’s tenure in DCPS over the past TWO years is that the gaps between kids at the 10th and the 90th percentiles, or between the 25th and 75th percentiles, or between the poor and the non-poor, or between white students and black students, have suddenly started to grow a lot, as shown on the NAEP. I’m not sure exactly why this is or how it’s happened, but I do know that it’s not happening in the nation as a whole or in any other big city. The only major change is Rhee and her single-minded insistence on control without accountability, elimination of due process for teachers, and substituting test prep for teaching. See Bill Turque’s excellent article in the Washington Post on 12/13/09.
  6. If Mr. Obama continues praising Michelle Rhee in particular, and charter schools in general, it will be an extremely sad event for public education in America. I hope he comes to his senses, and replaces Arne Duncan, too.
Published on December 15, 2009 at 3:25 am