More From Steve Rasmussen on Shoddy Test Items on the SBCC Samples

I am quoting extensively from Steve Rasmussen’s long and well-crafted critique of the released practice Common Core test items, because I think it’s very eye-opening and not enough people are reading it. Here is the money quote:

“…the Smarter Balanced tests are lemons. They fail to meet acceptable standards of quality and performance, especially with respect to their technology-enhanced items. They should be withdrawn from the market before they precipitate a national catastrophe.”

Here is some of the rest of his critique:

Flaws in the Smarter Balanced Test Items

What happened? Despite elaborate evidence-centered design frameworks touted by Smarter Balanced as our assurance that their tests would measure up, the implementation of the tests is egregiously flawed. I wish I could say the flaws in the Smarter Balanced tests are isolated. Unfortunately, they are not. While the shortcomings are omnipresent and varied, they fall into categories, all illustrated multiple times by the examples in this critique:

• Poorly worded and ambiguous mathematical language and non-mathematical instructions;

• Incorrect and unconventional mathematical graphical representations;

• Inconsistent mathematical representations and user interfaces from problem to problem;

• Shoddy and illogical user interface design, especially with respect to the dynamic aspects of the mathematical representations;

• Consistent violations and lack of attention to the Common Core State Standards;

• Failure to take advantage of available technologies in problem design.

As you’ll see as you look at these test items with me.

The result? Untold numbers of students and teachers in 17 Smarter Balanced states will be traumatized, stigmatized and unfairly penalized. And the quagmire of poor technological design, poor interaction design, and poor mathematics will hopelessly cloud the insights the tests might have given us into students’ understanding of mathematics.

Technology-enhanced items could have made use of widely ratified and highly developed technologies (e.g., graphing calculators, dynamic geometry and data analysis tools) to engage students in substantive tasks. Instead, these tests rely on a small number of pedestrian and illogical interface “widgets” (arrays of checkboxes, crude drawing tools, graphical keypad, drag-and-drop digit pilers, etc.) that the test item writers used via question templates. The widgets often provide window dressing for multiple-choice questions.

The $330 million of federal spending could have funded real innovation—or at least deployment of the best technologies available for these tests. The public at large—students, parents, educators, policy makers—who see these poor and dated uses of technology may incorrectly conclude that technology cannot significantly improve mathematics instruction. These tests give educational technology a bad name.

Soon after I circulated the first version of this critique, Elizabeth Willoughby, a fifth grade teacher in Clinton Township, MI, sent me the following note:

After reading your piece covering the flaws you found on the Smarter Balanced assessment, I had to reach out and thank you. I teach fifth grade. I put my students on the math test, made a video and sent it to Smarter Balanced. My students are on computers almost every day—they are tech savvy. The video is worth a watch:

I watched Ms. Willoughby’s video. You should, too. Her “tech savvy” kids are as confused by the test interface as I was. The video vividly demonstrates that even these very capable students will get stuck on the Smarter Balanced tests as a result of the shoddy interface. Ms. Willoughby also shared with me her email exchange with the Smarter Balanced Help Desk on the subject of her students’ problems. The email below is part of this exchange and occurred in March 2014:

…Reading below, you will see my students took the practice test and had many issues with the student interface. Smarter Balanced, in reply, sent me a series of confusing emails filled with half-information regarding access to TIDE and field tests which supposedly has updated tests with cleaner, easier to operate user interface tools… I would greatly appreciate an answer to a simple question:

Your email below acknowledges the issues with the student interface tools found on the practice tests. Your email also indicates you found the same issues in the recent field test. Your email below clearly indicates you will make changes to the practice test to address these issues…Can I get a general timeline as to when the update to the practice test will occur and will the practice test reflect all of the student interface skills students will need to perform tasks on the actual test?

As these skills are unique to your assessment (not found in other programs, apps, etc.), your practice test needs to provide those practice opportunities. I don’t mean to press, however, these ARE high stakes tests. I need to be prepared and I need to prepare my students for success on these tests, which includes providing them with the ability to use the assessment with success. Thanks, Elizabeth L. Willoughby

Ms. Willoughby received no satisfactory reply. Despite vague assurances the iterative rounds of field tests would address her students’ frustrations with the interface, we see that nothing has improved by the launch of the actual tests. CTB created nearly 10,000 test items for Smarter Balanced. If half of these are for mathematics, there are almost 5,000 items already deposited in the mathematics item bank. Bad items will surface on tests for years to come.

Liana Heitin, in a September 23, 2014, article in Education Week, “Will Common-Core Testing Platforms Impede Math Tasks?” wrote:

Some experts contend that forcing students to write a solution doesn’t match the expectations of the common-core math standards, which ask students to model mathematics using diagrams, graphs, and flowcharts, among other means. “It’s not like, during the year in classrooms, these kids are solving these problems on the computer,” said David Foster, the executive director of the Morgan Hill, Calif.-based Silicon Valley Mathematics Initiative, which provides professional development for math teachers, creates assessments, and has worked with both consortia. “It’s such an artificial idea that now it’s test time, so you have to solve these problems on computers.” Mr. Foster, who has authored problems for the new Common Core tests, goes on in the article to say: “I’m a mathematician, and I never solve problems by merely sitting at the keyboard. I have to take out paper and pencil and sketch and doodle and tinker around and draw charts,” he said. “Of course, I use spreadsheets all the time, but I don’t even start a spreadsheet until I know what I want to put in the cells. “All Smarter Balanced and PARCC are going to look at is the final explanation that is written down,” he said, “and if there’s a flaw in the logic, there’s no way to award kids for the work they really did and thought about.” Mr. Foster added: “I’ve played with the platform, and it makes me sick. And I’ve done it with problems I’ve written.”

Further along we hear the same sentiment from another expert:

But, as James W. Pellegrino, a professor of education at the University of Illinois-Chicago who serves on the technical-advisory committees of both consortia, points out, students can solve a single problem in any number of ways, not all of which are easy to explain in words. “The worry is [the platform] narrows the scope of what students can do, and the evidence they can provide about what they understand,” he said. “That leads to questions about the validity of the inferences you can make about whether students really developed the knowledge and skills that are part of the common core.”

In a post to the Illinois Council of Teachers of Mathematics listserv in July 2014, Martin Gartzman, Executive Director of the Center for Elementary Mathematics and Science Education at the University of Chicago, took specific aim at the shortcomings of the PARCC tests, but also stated that his criticisms applied equally to Smarter Balanced:

I understand that creating a large-scale assessment, such as the PARCC assessment, is an incredibly complex task that involves many decisions and many compromises. However, I assert that we are being far too generous about PARCC’s decision regarding the ways that students can enter their responses to open-response, handscored items. By accepting that decision, we are essentially endorsing an assessment system that, by design, does not give students a fair shot at showing what they know about mathematics, and that we know will underrepresent what Illinois students understand about the mathematics addressed in the CCSS-M.

This is not an issue of students needing to get used to the PARCC formats. The problem is that the test format itself is mathematically inadequate. The extensive PARCC field test definitively affirmed that the limited tools available to students (keyboard and equation editor) for entering their responses made it extremely difficult for many students to demonstrate what they knew about the CCSS-M content and practices.

While the experts cited here are highly critical, I think the actual situation with the new tests is even more disastrous than they describe. The tests suffer from the problems they describe and the issues go far beyond the limitations imposed by computer keyboards and equation editors. The appalling craft displayed in these tests compounds the problems that even well-conceived computer-based mathematics tests would have to overcome to effectively assess students.

In July 2012, Measured Progress, a contractor to Smarter Balanced, warned in Smarter Balanced Quality Assurance Approach Recommendation for the Smarter Balanced Assessment Consortium:

In this industry and with a system of this highly visible nature, the effects of software that has not been sufficiently tested can lead to an array of problems during a test administration that can be financially and politically expensive.

Interestingly, my online review of the Smarter Balanced proposals and contract documents finds little evidence of attention to quality assurance at the level of “widget” or item development. There are vague statements about item review processes, but few specifics. There is a tacit assumption that the companies that develop high-stakes tests know how to develop mathematical test items and will do it well and that they are capable of performing their own quality assurance. Those of us in the education industry know better.

Unfortunately, the Smarter Balanced tests are lemons. They fail to meet acceptable standards of quality and performance, especially with respect to their technology-enhanced items. They should be withdrawn from the market before they precipitate a national catastrophe.

We know, however, that this won’t happen. Test season has already started.

Kids Deserve Better

Struggling students will likely be penalized more than proficient students on the Smarter Balanced tests as the cognitive load of grappling with poorly designed interfaces and interactive elements will raise already high levels of test anxiety to even more distracting levels. Those who attempt to mine the test results for educational insight—teachers, administrators, parents, researchers, policy makers—will be unable to discern the extent to which poor results are a reflection of students’ misunderstandings or a reflection of students’ inability to express themselves due to difficulties using a computer keyboard or navigating poorly constructed questions and inadequate interactive design.

Time spent prepping for these tests using the practice and training tests and learning how to use the arcane test tools like the “Equation Response Editor tool” is educational time squandered. Many schools have scheduled inordinate numbers of days just for this test prep, but using the tools offered by Smarter Balanced will lead to none of the educational outcomes promised to support CCSSM.

These tools are not learning tools that lead to mathematical insight; they’re highly contrived force-a-square-peg-into-a-round-hole test-specific tools. If widespread testing is going to be a reality in schools, and if schools are going to deploy scarce resources to support computer-based tests, then it is essential that tests successfully assess students and contribute more generally to the improvement of the quality of education.

There is no good reason for the tests to be this bad. The past forty years of extraordinary progress in research-directed development of mathematics visualization and technology for expressing mathematical reasoning could be put to use to power these tests—elegantly and effectively. As an example of computer-based assessment pursuing a vastly higher quality standard than that achieved by Smarter Balanced and CTB, look at the December 15 and January 5, 12, and 19 blog posts at Sine of the Times, which describe work we did some years ago at KCP Technologies and Key Curriculum Press.

Others in the mathematics education community—researchers and practitioners—know how to do quality work.

“Déjà Vu All Over Again”

The results of the Smarter Balanced tests for 2014–2015, when they come, will further confuse the national debate about Common Core and contribute significantly to its demise. Because the general public has no reason to believe that these results do not accurately reflect mathematics education in this country, they will not realize that the poor performance of students on these tests is due, in significant part, to the poor craft of the test makers. When poor results make headlines, will anyone point the finger in the direction of the test makers? Likely not. Students and frontline educators at all levels will be attacked as incompetent—but the incompetent test makers will get a free pass. I’ve seen this before.

Twenty-five years ago, Creative Publications, a California publisher, developed MathLand, an innovative elementary mathematics program. These materials were rated as “promising” by a U.S. Education Department panel. But Creative Publications had rushed the materials to market for the 1992 state of California adoption and MathLand was not ready for “prime time.” However “promising,” it was poorly crafted—ideas were not fully developed, there had been little or no field-testing, little revision of the original manuscript, and there had been no application of the iterative principles of product engineering.

Critique of Smarter Balanced Common Core Tests for Mathematics, SR Education Associates

Even so, a majority of California school districts adopted MathLand. Why? Because it was a promising idea and the craft issues with MathLand were invisible to an untrained eye. And there was no pilot period during which schools and districts could properly vet the materials within the California adoption timeline and reject them if they proved lacking. Whatever one’s position on the underlying educational principles of MathLand, the materials did not work well in classrooms—but no one found this out until too late. Completely lost in the public uproar over MathLand was the distinction between good ideas and poor craft. As soon as they were able, California districts abandoned MathLand.

Creative Publications disappeared.

In California, the MathLand fiasco discredited California’s 1992 Mathematics Framework and significantly contributed to the launch of the national “Math Wars.”

It has taken 20 years to undo the damage these poorly crafted materials did to our mathematics education community. The Common Core State Standards for Mathematics and the high-stakes tests under development by Smarter Balanced and PARCC are not one and the same. However, in the public eye, and particularly in the crosshairs of Common Core political opponents, Common Core and the Smarter Balanced/PARCC high-stakes tests funded by the federal government are two sides of the same coin.

As Diane Briars, National Council of Teachers of Mathematics (NCTM) President, pointed out in “Core Truths,” a July 2014 “President’s Corner” message: “Particularly problematic is a tendency to equate CCSSM with testing and with test-related activities and practices.”

While certainly not perfect, the Common Core State Standards for Mathematics are a step forward, especially because of the prominence of the Standards for Mathematical Practice. I believe that CCSSM should continue to receive full support and that it should evolve and improve based on the experiences of practicing teachers, mathematics professionals, mathematics educators, parents, students, and a wide range of other stakeholders. In high-performing countries like Singapore and South Korea, national curricula are revised and improved on a regular schedule. South Korea, for instance, has revised its national curricula once every 5 to 7 years and is now using the 7th iteration of the curriculum. However, the appalling Smarter Balanced high-stakes tests could well be the death of the national effort to improve mathematics instruction via Common Core—before we ever get to iteration 2. That would be tragic.

Published on March 18, 2015 at 8:05 pm

4 Comments

  1. […] Brandenburg, a retired math teacher and outstanding blogger, here revisits Steven Rasmussen’s critique of the Smarter Balanced Assessment Consortium’s math tests. Rasmussen was co-founder […]


  2. You say: However, the appalling Smarter Balanced high-stakes tests could well be the death of the national effort to improve mathematics instruction via Common Core—before we ever get to iteration 2. That would be tragic.

    Not a math teacher, but given all that is wrong with these tests why are you eager to imply that the CCSS are a great way to improve math instruction?
    Do you really think the CCSS represent the best way? If so why?
    Have you or anyone you know seen the curriculum materials that each consortium asked USDE to pay for so they could make tests? What do you think your curriculum money bought– at least $30 million divided by 2 test development operations, perhaps divided again by 2 subjects. Are you aware that that piece of the CCSS implementation is illegal, prohibited by federal law?


    • I’m quoting Steve Rasmussen here – almost none of that post was my own wording.
      He is the one stating that the CCSS in math is better than what we had before in most states, and I think he’s probably right. It’s not perfect, nothing is, but what we had here in Washington DC was pretty lame in my opinion.
      I am certainly aware of the tremendous pressure (both federal and corporate) on each state to force them to approve the entire package – standards and testing-all-the-time and cash to corporations and closing schools and firing teachers and turning schools over to private operators. If the standards had been complete crap, then no one would have accepted the rest of the poison too…


  3. We need to focus on the poor quality of the CC math standards themselves. Read the actual text of the standards – an obvious rush job by people not experienced in writing standards. A step forward? Which way is forward? Are we in wonderland?

