Michael Martin on Standardized Testing

A recent post by Michael Martin (and I hope he forgives me):

I think the fallacy here is to conflate pretest/posttest comparisons in specific subject areas with broad high-stakes tests that are removed in time from the instruction. There is always a problem with using multiple-choice tests to measure anything other than what is termed “inert knowledge,” but as with any measuring instrument, how it is used makes a big difference.

As UCLA Professor James Popham, an expert in testing and former president of the American Educational Research Association, wrote in a March 1999 essay titled “Why Standardized Tests Don’t Measure Educational Quality,” published in Educational Leadership:

“Employing standardized achievement tests to ascertain educational quality is like measuring temperature with a tablespoon. Tablespoons have a different measurement mission than indicating how hot or cold something is.”

There are numerous valid uses of professional testing. In my experience, however, there is very little appreciation for how difficult it is to develop a valid question that measures what you want to measure. I took a survey methodology course in college where each week we designed and administered a small survey, with the restriction that we had to ask each question in two different ways to see if the responses matched up. What I learned was that we were never able to reliably design a survey in which the two versions of a question gave similar results.

People have an unwarranted trust in tests, much like the trust they have in lie detector tests. Experts in both fields explain that the tests don’t do what people think they do.

Harvard University Professor Daniel Koretz, a national expert on achievement testing, writes in his 2008 book “Measuring Up” that there is “a single principle” that should guide the use of tests, “don’t treat ‘her score on the test’ as a synonym for ‘what she has learned.’”

In a May 8, 2005, news story, a Cox News Service reporter interviewed experts on standardized testing and reported that at “the Lindquist Center – located on the University of Iowa campus and named for the grandfather of standardized testing – you won’t find a lot of fans of No Child Left Behind.” The story, titled “U.S. testing craze worries experts behind the scores,” explained “the consensus is that standardized tests weren’t created for such a sweeping, high-stakes purpose” and continued:

“That’s the position of our entire field,” said Steve Dunbar, head of Iowa Testing Programs, developer of the Iowa Test of Basic Skills. … Experts in the Lindquist Center … expect the No Child Left Behind to run its course, confident the politically driven pendulum will swing back to a more reasonable view of the value of testing. Dunbar predicts public support will wane because of results that don’t seem to make sense. “The tests,” Dunbar said, “will lose credibility.”

One of the foremost experts on academic testing in the world, Professor Robert Linn, wrote in a 1998 technical paper for the Center for the Study of Evaluation:

“As someone who has spent his entire career doing research, writing, and thinking about educational testing and assessment issues, I would like to conclude by summarizing a compelling case showing that the major uses of tests for student and school accountability during the past 50 years have improved education and student learning in dramatic ways. Unfortunately, that is not my conclusion. Instead I am led to conclude that in most cases the instruments and technology have not been up to the demands that have been placed on them by high-stakes accountability.”

Economists are simply ignorant of the reality in testing. They think that all numbers are accurate measures. After I had graduated from college and worked in the field for several years, I enrolled in and then dropped out of a master’s program in economics because I could not believe how naïve the instructors were. I took several classes, and in each one I would draw a vertical line in my notebook, writing what they told me on the left of the line and what I knew to be true on the right. The back breaker was a course in international trade in which most of the semester was spent teaching the Heckscher-Ohlin theory, and in the last two weeks they revealed the Leontief Paradox: economist Wassily Leontief had tested the Heckscher-Ohlin theory and found it was wrong. Leontief later received the Nobel Prize. They spent an entire semester teaching student economists a theory that had already been proven wrong.

I also should point out that at Arizona State University, where I took these courses, and presumably at other universities, they offered a B.A. and a B.S. in economics, which were unfortunately named in reverse. I had to work with people who graduated with a B.A. in economics, and what they learned was mostly BS that required no mathematical training. The B.S. in economics had an entirely different curriculum involving mathematics, and its students were even disparagingly called “quants” by the B.A. people for their quantitative approaches. It was a bizarre Alice in Wonderland world. The only saving grace is that the one professor I actually respected was later made chairman of the department.

So it doesn’t surprise me at all that the most foolish reports about using data in education come from economists. A lot of them are way out of their depth. On the other hand, one of the most influential studies that I’ve seen regarding education was done back around 1979 by the Philadelphia Federal Reserve in conjunction with the school board, using statistics to identify which education variables were associated with gains in fourth grade reading scores. The Federal Reserve economists regularly used econometric models to work with economic data, so they were experts in using quantitative methods. What impressed me was that they found little correlation between test score gains and whether the teacher had a background in reading instruction, but a strong correlation with whether the principal had a background in reading instruction. Something to think about when you consider value added over thirty years later.

Michael T. Martin
Research Analyst
Arizona School Boards Association
2100 N. Central Ave, Suite 200
Phoenix, AZ 85004


One Comment

  1. Popham is completely dishonest if he’s claiming that today’s standardized tests don’t exhibit reliability — they’re exhaustively examined to make sure of that. Moreover, his paragraph confuses the concepts of “validity” (whether the test measures the right thing) with “reliability” (whether students tend to have stable scores from one test to another). These are very basic concepts, and Popham doesn’t seem to know what he’s talking about.

    It would be like reading a physicist who said, “We don’t know how to measure gravity very well, because it’s really hard to figure out electromagnetism.” That’s an expert physicist? Um, no.

