On this blog I have reprinted examples of what I see as crappy test items and dissected them, hoping to show readers that those items neither made sense nor measured what they purported to measure.
However, I never worked inside the testing industry itself, so I don’t have direct experience of making up BS test items on an industrial scale.* My own experience is that EVERY test, no matter how good, has validity and reliability problems. The passage below shows that the tests on which all US educational decisions are supposed to be based are, in fact, ridiculously badly made from the beginning, cannot possibly measure what they pretend to measure, and are unreliable and thus utterly invalid. (Plus, the tests are snatching at least potentially valuable class time away from our students, while enabling a handful of big corporations like Pearson (more on which below) to rake in huge dividends because they control almost the entire education market.)
This comes from an interview published by Diane Ravitch (http://dianeravitch.net/2012/12/27/11990/):
Rebecca Rubenstein: Since your book was published in 2009, has the “standardized” testing industry improved?
Todd Farley: Not the slightest bit. There was a story in The New York Times in 2001 about how test-scoring was a wildly out-of-control industry, which quotes various employees—not me!—as saying that they faced “too little time, too much to do, not enough people.” It implies the industry was doing a terribly suspect job. Since then, the industry is about a hundred times bigger, but those problems mentioned in the Times article or in my book have never been addressed. The industry has simply grown exponentially, and there are hundreds of millions of dollars to be earned by companies that are completely unregulated—to repeat, completely unregulated, so whatever Pearson et al. tell us, we’re supposed to say “thank you very much” and just write them a staggeringly large check—but of course things haven’t gotten any better.
In my time in test-scoring, we never had enough temporary employees to do the work; we always had too much to do and too little time to do it; and there were always financial punishments looming over our heads if we didn’t get things done. We cut whatever corners we could to get it done (I’m sorry to say). Today the work load is a hundred times bigger and the money to be made is a hundred times bigger, but the system didn’t work to begin with and of course it doesn’t work now.
The same is true in the test development business. When I worked for one publisher as a test developer, it was always a madcap race to get tests written on time, and we faced absurd deadlines and pressure to do so. The reality is that quality was always secondary to the bottom line when developing tests, and then when the Common Core standards were introduced, and tests and products needed to be written for them, our deadlines became laughably absurd; I was once involved in the development of 200 tests in two months, which I think is literally more tests than ETS has produced in its entire existence. With the Common Core standards released, all the companies knew all the other companies were racing to finish their tests and products first, so quality became even worse than secondary. It became tertiary, or “fourthiary,” or whatever. Subcontractors who had been fired for poor work were rehired; item writers were hired off Craigslist; test developers with neither teaching experience nor test development experience were given full-time jobs. It’s important to remember that at the end of the day, companies like Pearson are for-profit enterprises. They want to make money. They want to make money, so of course they do a crappy job, because the quality of the work is never anywhere near as important as their desire to make a profit, and there’s always too much work and too little time to do it.
A comment: I was at first skeptical of the claim that the “200 tests” mentioned are more than ETS has created in its entire existence. But I think he may be right: the SAT is essentially one, or two, or three tests (Reading, Math, and Writing), depending on how you look at it; it just gets revised a little bit each year. Plus, there are perhaps a couple of score of different Advanced Placement (AP) tests and Achievement tests in different subjects; they get revised every year, or at least they do in the field of math (which I follow, of course) and some others.
But what Pearson is doing now is essentially trying to replace the teacher in every single grade level, for every single course, by making the entire curriculum driven by the tests and pre-tests and practice tests and test prep material that Pearson provides. Yes, I do mean all of third grade. Yes, I do mean 6th grade science, music appreciation, and geography and PE. Every class. And if you count every single course or subject area that a student might be measured in, from Pre-K-3 all the way up to graduating from high school, that might in fact be roughly 200 brand-new test series! Not just end-of-course tests, by any means. A different corporate multiple-choice test every month or two!
All this corporate educa-crap is just that: crap forced down the throats of public school kids, and ONLY kids in public schools.
And it won’t improve a damned thing. Except for corporate bottom lines.
Of course the children or grandchildren of Michelle Rhee, Michael Bloomberg, Arne Duncan, Eli Broad, Bill Gates, the Koch brothers, and Barack Obama will never, ever be subjected to such a poor excuse for an education.
That’s just for the poor black and Latino and white kids who are in high-poverty regions; the only way they can opt out is to go to a charter school, which might be doing any damned thing and is almost sure to be even more segregated than the nearest public school, if that’s even possible.
This is progress?
* My students and I often found mistakes on tests and quizzes and assignments I made up. I used to congratulate the students and give them a point when they pointed out an error. The responses of ETS and Pearson have been rather different. Remember the famous talking pineapple question? And do you recall that essentially no one has ever been able to explain, line by line, number by number, exactly how ANY single teacher’s VAM (value-added model) numbers were calculated? Has any school district ever released data showing how well VAM and supposedly ‘scientific’ classroom observation data correlate with each other? (Hint: they don’t!!)
Once again, let me urge the leadership of the Washington Teachers’ Union, and teacher unions elsewhere, to enlist a good statistician with his/her feet on the ground to poke holes in VAM. It’s all a tissue of fabrications.