Why the Smarter Balanced Common Core Math Test is Fatally Flawed

This spring, tests developed by the Smarter Balanced Assessment Consortium will be administered to well over 10 million students in 17 states to determine their proficiency on the Common Core Standards for Mathematics. If the results are disappointing, who should shoulder the blame—the students, their teachers, the administrators, the standards? Based on my analysis of the available test items, the real culprit will be the tests themselves.

In 2010, I, like many educators, was hopeful that $330 million of tax dollars from the U.S. Education Department and the pooled resources of state governments would produce a new generation of Common Core mathematics and language arts standardized tests that would be better than traditional paper-and-pencil multiple-choice tests.

Smarter Balanced, one of the two contractors the U.S. Department of Education selected to develop the tests, promised technology-enhanced tests that made smart use of digital tools for mathematics to more deeply assess student knowledge. With an award of $176 million, it vowed to “create innovative and real-world item types that rely on technology platforms.” It then contracted with CTB/McGraw Hill, a traditional test-making company, to make its “innovative” tests.

Given my background as a publisher of mathematics curriculum and software, I was keenly interested to see whether the vision of better tests had been fulfilled. I took a close look at the Smarter Balanced practice and training tests available online. What I found shocked me: a quagmire of poor technological design, poor interaction design, and poor mathematics that hopelessly clouds the insights the tests might give us into students’ thinking.

Below are just a few of my findings. You’ll find the actual sample questions along with analysis in my report at www.mathedconsulting.com.

A question involving fuel consumption in cars is contrived so that answers to division problems with mixed numbers turn out unexpectedly to be whole numbers. Why? So that the test makers can ask students to drag little cars to a number line that only allows cars at whole numbers. If a student mistakenly obtains a non-whole number answer and tries to drag a car to that location, the car will “snap” to the nearest whole number when the student lets go. This might be good for students—changing an incorrect answer to a potentially correct one—but it’s useless in determining what they do and don’t know.

Given a geometry problem involving circles on a coordinate plane, students are asked to “show” their work. The obvious way to approach this problem is to start with a drawing. But no drawing tools are offered with this “technology-enhanced” question. Students must type their answers and “work” in prose with the computer keyboard. What is Common Core Mathematical Practice Standard 5? “Use appropriate tools strategically.” Not on this question!

Mathematical Practice Standard 6 states that “mathematically proficient students try to communicate precisely to others.” It’s too bad the Smarter Balanced tests do not heed this Common Core call. A sample assessment item asks students to draw a rhombus that is also a rectangle, using a “Connect Line” tool. A rhombus, however, is composed of segments, not lines. And what is a “Connect Line” tool, anyway? Weird name! Does it connect lines? Does the tool draw lines? Actually, it does neither: it draws segments. Who knew?

There is no reason for requiring computers for these tests. Not one of the practice and training test items is improved through the use of technology. The test is not “smarter” as its name implies. The items do not probe deeper than a paper-and-pencil test can. The primitive software used only makes the test more difficult for students and reduces the reliability of the resulting scores.

If the released items on the tests are indicative of the quality of the actual tests—and Smarter Balanced tells us they are—their shoddy craft will directly and significantly contribute to students’ poor and inaccurate scores. The result? Untold numbers of students and teachers will be traumatized, stigmatized, and unfairly penalized. And sadly, struggling students will likely be penalized more than proficient students as the cognitive load of grappling with poorly designed software will compound other anxiety-producing factors to unnecessarily reduce their scores.

The results of the Smarter Balanced tests for 2014–2015, when they are released, will further confuse the highly politicized national debate about Common Core and could cause its demise. The general public has every reason to believe that these results accurately reflect the state of mathematics education in this country, so they will not realize that the performance of students on these tests is lower due, in significant part, to the poor craft of the test makers.

When poor results make headlines, will anyone point the finger in the direction of the test makers? Likely not. The new standards will be blamed, and students and frontline educators at all levels will be attacked as incompetent, but the incompetent test makers will get a free pass.

There is no good reason for the tests to be this bad. The past forty years of extraordinary progress in research-directed development of mathematics visualization and technology for expressing mathematical reasoning could be put to use to power these tests—elegantly and effectively. The test makers failed to apply this research.

It’s no wonder that districts have decided to opt out of the new rounds of testing. Schools and school boards that participate in the testing should ignore the results—they are not accurate measures of mathematical understanding.

We must work to uncouple Common Core from the testing consortia and try to save the potential of Common Core, even while we let the tests, testing consortia, and their corporate partners crash and burn. The standards have value. These tests do not.

We can continue to research and develop well-crafted digital tools for mathematics education and work to deploy them in realistic time frames and in appropriate contexts.

We can demand the education funding necessary for teaching and assessing in this country in ways worthy of our students.

And maybe we can even make great assessments some day.