The ‘Value-Added Measurement’ movement in American education, implemented in part by the now-disgraced Michelle Rhee here in DC, has been a complete and utter failure, even as measured by its own yardsticks, as you will see below.
Yet, the same corporate ‘reformers’ who were its major cheerleaders do not conclude from this that the idea was a bad one. Instead, they claim that it wasn’t tried with enough rigor and fidelity.
How to Learn Nothing from the Failure of VAM-Based Teacher Evaluation
The Annenberg Institute for School Reform is a most exclusive academic club lavishly funded and outfitted at Brown U. for the advancement of corporate education in America.
The Institute is headed by Susanna Loeb, who has a whole slew of degrees from prestigious universities, none of which has anything to do with the science and art of schooling, teaching, or learning.
Researchers at the Institute are circulating a working paper that, at first glance, would suggest that school reformers might have learned something about the failure of teacher evaluation based on value-added models applied to student test scores. The abstract:
Starting in 2009, the U.S. public education system undertook a massive effort to institute new high-stakes teacher evaluation systems. We examine the effects of these reforms on student achievement and attainment at a national scale by exploiting the staggered timing of implementation across states. We find precisely estimated null effects, on average, that rule out impacts as small as 1.5 percent of a standard deviation for achievement and 1 percentage point for high school graduation and college enrollment. We also find little evidence of heterogeneous effects across an index measuring system design rigor, specific design features, and district characteristics. [my emphasis – GFB]
So could this mean that the national failure of VAM applied to teacher evaluation might translate to decreasing the brutalization of teachers and the waste of student learning time that resulted from the implementation of VAM beginning in 2009?
No such luck.
The conclusion of the paper, in fact, clearly shows that the Annenbergers have concluded that the failure to raise test scores by corporate accountability means (VAM) resulted from laggard states and districts that did not adhere strictly to the VAM’s mad methods. In short, the corporate-led failure of VAM in education happened as a result of schools not being corporate enough:
Firms in the private sector often fail to implement best management practices and performance evaluation systems because of imperfectly competitive markets and the costs of implementing such policies and practices (Bloom and Van Reenen 2007). These same factors are likely to have influenced the design and implementation of teacher evaluation reforms. Unlike firms in a perfectly competitive market with incentives to implement management and evaluation systems that increase productivity, school districts and states face less competitive pressure to innovate. Similarly, adopting evaluation systems like the one implemented in Washington D.C. requires a significant investment of time, money, and political capital. Many states may have believed that the costs of these investments outweighed the benefits. Consequently, the evaluation systems adopted by many states were not meaningfully different from the status quo and subsequently failed to improve student outcomes.
So the Gates-Duncan RTTT corporate plan for teacher evaluation failed not because it was a corporate model but because it was not corporate enough! In short, there were way too many small carrots and not enough big sticks.
Someone named David Banks will most likely be the next head of New York City’s public school system. Gary Rubinstein, a math teacher and prolific blogger at Stuyvesant HS there, had never heard of the fellow, even though Banks had founded and led a network of nearly a dozen NYC public schools called the Eagle Academy for Young Men. So Rubinstein looked at the public record, and found that on just about all measures that Reformsters use, those schools are mostly failures. However, none of the local news outlets (NY Times, NY Post, etc) appears to have examined that record.
I followed the links he gave, and found the following graphics for a number of the schools in that network. Feel free to explore some on your own by going here and then typing the word “eagle” and choosing any one of the schools.
Note the dark blue dots in the right-hand graph in each case that I copied and pasted. Without exception, they show that the schools in that network had both low average student performance AND a low impact (meaning they didn’t raise the academic performance of their students) by comparison with all of the other public schools in New York City.
Why do ‘reformers’ get a pass from the media, even though they never succeed at pulling off what they so boldly promise?
This is a long article printed on Alternet. A teacher believed the hype about magical charter schools so much that she tried twice to form her own charter school. (She failed to get approval, rightly so, she says.)
So she went to work for an existing charter school (whose name she doesn’t provide) in Los Angeles and the scales fell from her eyes.
[Ed. Note: As DC’s office of the state superintendent of education (OSSE) seeks a waiver of PARCC testing again (recall that OSSE waived PARCC last school year due to the pandemic) and the DC auditor just released a bombshell report of poor stewardship of DC’s education data, it is time to revisit how standardized test data, teacher evaluations, and harsh school penalties were united by ed reformers in DCPS under mayoral control. This first-hand account of what went down in DCPS, the first of two parts by semi-retired educator Richard P. Phelps, appeared in Nonpartisan Education Review in September 2020 and is reprinted here with permission. The author thanks DC budget expert Mary Levy and retired DCPS teacher Erich Martel for their helpful comments in the research of this article.]
By Richard P. Phelps
Ten years ago, I worked as the director of assessments for DCPS. My tenure coincided with Michelle Rhee’s last nine months as chancellor. I departed shortly after Vincent Gray defeated Adrian Fenty in the September 2010 DC mayoral primary.
My primary task was to design an expansion of that testing program that served the IMPACT teacher evaluation system to include all core subjects and all grade levels. Despite its fame (or infamy), the test score aspect of the IMPACT program affected only 13% of teachers, those teaching either reading or math in grades four through eight. Only those subjects and grade levels included the requisite pre- and post-tests required for teacher “value added” measurements (VAM). Not included were most subjects (e.g., science, social studies, art, music, physical education), grades kindergarten to two, and high school.
Chancellor Rhee wanted many more teachers included. So, I designed a system that would cover more than half the DCPS teacher force, from kindergarten through high school. You haven’t heard about it because it never happened. The newly elected Vincent Gray had promised during his mayoral campaign to reduce the amount of testing; the proposed expansion would have increased it fourfold.
VAM affected teachers’ jobs. A low value-added score could lead to termination; a high score, to promotion and a cash bonus. VAM as it was then structured was obviously, glaringly flawed, as anyone with a strong background in educational testing could have seen. Unfortunately, among the many new central office hires from the elite of ed reform circles, none had such a background. (Even a primary grades teacher with the same group of students the entire school day had those students for less than six hours a day, five days a week, for less than half the year. All told, even in the highest exposure circumstances, a teacher interacted with the same group of students for less than a tenth of each student’s waking hours in a year, and for less than a twentieth in the tested subjects of English and math. In the lowest exposure circumstance, a high school teacher might interact with a class of English or math students for less than three percent of a student’s annual hours.)
Before posting a request for proposals from commercial test developers for the testing expansion plan, I was instructed to survey two groups of stakeholders—central office managers and school-level teachers and administrators.
Not surprisingly, some of the central office managers consulted requested additions or changes to the proposed testing program where they thought it would benefit their domain of responsibility. The net effect on school-level personnel would have been to add to their administrative burden. Nonetheless, all requests from central office managers would be honored.
The Grand Tour
At about the same time, over several weeks of the late spring and early summer of 2010, along with a bright summer intern, I visited a dozen DCPS schools. The alleged purpose was to collect feedback on the design of the expanded testing program. I enjoyed these meetings. They were informative, animated, and very well attended. School staff appreciated the apparent opportunity to contribute to policy decisions and tried to make the most of it.
Each school greeted us with a full complement of faculty and staff on their days off, numbering several dozen educators at some venues. They believed what we had told them: that we were in the process of redesigning the DCPS assessment program and were genuinely interested in their suggestions for how best to do it.
At no venue did we encounter stand-pat knee-jerk rejection of education reform efforts. Some educators were avowed advocates for the Rhee administration’s reform policies, but most were basically dedicated educators determined to do what was best for their community within the current context.
The Grand Tour was insightful, too. I learned for the first time of certain aspects of DCPS’s assessment system that were essential to consider in its proper design, aspects of which the higher-ups in the DCPS Central Office either were not aware or did not consider relevant.
The group of visited schools represented DCPS as a whole in appropriate proportions geographically, ethnically, and by education level (i.e., primary, middle, and high). Within those parameters, however, only schools with “friendly” administrations were chosen. That is, we only visited schools with principals and staff openly supportive of the Rhee-Henderson agenda.
But even they desired changes to the testing program, whether or not it was expanded. Their suggestions covered both the annual districtwide DC-CAS (or “comprehensive” assessment system), on which the teacher evaluation system was based, and the DC-BAS (or “benchmarking” assessment system), a series of four annual “no-stakes” interim tests unique to DCPS, ostensibly offered to help prepare students and teachers for the consequential-for-some-school-staff DC-CAS. (Though officially “no stakes,” some principals analyzed results from the DC-BAS to identify students whose scores lay just under the next higher benchmark and encouraged teachers to focus their instructional efforts on them. Moreover, at the high school level, where testing occurred only in grade 10, students who performed poorly on the DC-BAS might be artificially re-classified as held-back 9th graders or advanced prematurely to 11th grade in order to avoid the DC-CAS.)
At each staff meeting I asked for a show of hands on several issues of interest that I thought were actionable. Some suggestions for program changes received close to unanimous support. Allow me to describe several.
***Move DC-CAS test administration later in the school year. Many citizens may have logically assumed that the IMPACT teacher evaluation numbers were calculated from a standard pre-post test schedule, testing a teacher’s students at the beginning of their academic year together and then again at the end. In 2010, however, the DC-CAS was administered in March, three months before school year end. Moreover, that single administration of the test served as both pre- and post-test, posttest for the current school year and pretest for the following school year. Thus, before a teacher even met their new students in late August or early September, almost half of the year for which teachers were judged had already transpired—the three months in the spring spent with the previous year’s teacher and almost three months of summer vacation.
School staff recommended pushing DC-CAS administration to later in the school year. Furthermore, they advocated a genuine pre-post-test administration schedule—pre-test the students in late August–early September and post-test them in late-May–early June—to cover a teacher’s actual span of time with the students.
This suggestion was rejected because the test development firm with the DC-CAS contract required three months to score some portions of the test in time for the IMPACT teacher ratings scheduled for early July delivery, before the start of the new school year. Some small number of teachers would be terminated based on their IMPACT scores, so management demanded those scores be available before preparations for the new school year began.
The tail wagged the dog.
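To make the timing problem concrete, here is a minimal sketch in Python that compares the two schedules described above: the March-to-March window actually used, and the fall-to-spring window school staff asked for. The specific dates are assumptions chosen only to match the rough calendar in the account (a mid-March test, a school year running from late August to mid-June), and the function name is hypothetical.

```python
from datetime import date


def share_with_rated_teacher(pre_test, post_test, class_start, class_end):
    """Fraction of the pre-test-to-post-test window that falls inside
    the rated teacher's time with the class."""
    window_days = (post_test - pre_test).days
    overlap_start = max(pre_test, class_start)
    overlap_end = min(post_test, class_end)
    overlap_days = max((overlap_end - overlap_start).days, 0)
    return overlap_days / window_days


# Assumed (illustrative) calendar for one teacher's year with a class.
class_start = date(2009, 8, 24)
class_end = date(2010, 6, 15)

# Schedule as described: one March test serves as last year's post-test
# and this year's pre-test.
march_to_march = share_with_rated_teacher(
    date(2009, 3, 15), date(2010, 3, 15), class_start, class_end
)

# Schedule staff proposed: pre-test in the fall, post-test in the spring.
fall_to_spring = share_with_rated_teacher(
    date(2009, 8, 31), date(2010, 6, 1), class_start, class_end
)

print(f"March-to-March: {march_to_march:.0%} of the window with the rated teacher")
print(f"Fall-to-spring: {fall_to_spring:.0%} of the window with the rated teacher")
```

Under these assumed dates, only a bit more than half of the March-to-March window falls within the rated teacher's time with the students, while the proposed fall-to-spring window lies entirely within it.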
***Add some stakes to the DC-CAS in the upper grades. Because DC-CAS test scores portended consequences for teachers but none for students, some students expended little effort on the test. Indeed, extensive research on “no-stakes” (for students) tests reveals that motivation and effort vary by a range of factors including gender, ethnicity, socioeconomic class, the weather, and age. Generally, the older the student, the lower the test-taking effort. This disadvantaged some teachers in the IMPACT ratings for circumstances beyond their control: unlucky student demographics.
Central office management rejected this suggestion to add even modest stakes to the upper grades’ DC-CAS; no reason given.
***Move one of the DC-BAS tests to year end. If management rejected the suggestion to move DC-CAS test administration to the end of the school year, school staff suggested scheduling one of the no-stakes DC-BAS benchmarking tests for late May–early June. As it was, the schedule squeezed all four benchmarking test administrations between early September and mid-February. Moving just one of them to the end of the year would give the following year’s teachers a more recent reading (by more than three months) of their new students’ academic levels and needs.
Central office management rejected this suggestion probably because the real purpose of the DC-BAS was not to help teachers understand their students’ academic levels and needs, as the following will explain.
***Change DC-BAS tests so they cover recently taught content. Many DC citizens probably assumed that, like most tests, the DC-BAS interim tests covered recently taught content, such as that covered since the previous test administration. Not so in 2010. The first annual DC-BAS was administered in early September, just after the year’s courses commenced. Moreover, it covered the same content domain—that for the entirety of the school year—as each of the next three DC-BAS tests.
School staff proposed changing the full-year “comprehensive” content coverage of each DC-BAS test to partial-year “cumulative” coverage, so students would only be tested on what they had been taught prior to each test administration.
This suggestion, too, was rejected. Testing the same full-year comprehensive content domain produced a predictable, flattering score rise. With each DC-BAS test administration, students recognized more of the content, because they had just been exposed to more of it, so average scores predictably rose. With test scores always rising, it looked like student achievement improved steadily each year. Achieving this contrived score increase required testing students on some material to which they had not yet been exposed, both a violation of professional testing standards and a poor method for instilling student confidence. (Of course, it was also less expensive to administer essentially the same test four times a year than to develop four genuinely different tests.)
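The mechanism behind that built-in score rise can be sketched with a toy model in Python. Everything numeric here is an assumption for illustration, not DCPS data: the share of the year's content taught by each test date, the rate at which students answer taught material correctly, and the rate at which they guess correctly on untaught material.

```python
def expected_score(fraction_taught, mastery_of_taught, guess_rate):
    """Expected share of items answered correctly when only part of the
    tested content has been taught so far."""
    taught = fraction_taught * mastery_of_taught
    untaught = (1.0 - fraction_taught) * guess_rate
    return taught + untaught


# Hypothetical share of the full-year content taught by each of the four
# test dates (all four fell between early September and mid-February).
fractions_taught = [0.05, 0.20, 0.35, 0.50]
MASTERY = 0.70  # assumed constant: students never get better at taught material
GUESS = 0.25    # assumed chance of guessing right on untaught items

# "Comprehensive" design: every test spans the whole year's content.
comprehensive = [expected_score(f, MASTERY, GUESS) for f in fractions_taught]

# "Cumulative" design: each test covers only content taught so far,
# so all tested material has been taught.
cumulative = [expected_score(1.0, MASTERY, GUESS) for _ in fractions_taught]

for n, (comp, cum) in enumerate(zip(comprehensive, cumulative), start=1):
    print(f"Test {n}: comprehensive {comp:.1%}, cumulative {cum:.1%}")
```

In this sketch the comprehensive scores climb with every administration even though mastery of taught material is held fixed, while the cumulative scores stay flat. The rise comes from exposure alone, which is the point school staff were making.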
***Synchronize the sequencing of curricular content across the District. DCPS management rhetoric circa 2010 attributed classroom-level benefits to the testing program. Teachers would know more about their students’ levels and needs and could also learn from each other. Yet, the only student test results teachers received at the beginning of each school year were half a year old, and most of the information they received over the course of four DC-BAS test administrations was based on not-yet-taught content.
As for cross-district teacher cooperation, unfortunately there was no cross-District coordination of common curricular sequences. Each teacher paced their subject matter however they wished and varied topical emphases according to their own personal preference.
It took DCPS’s chief academic officer, Carey Wright, and her chief of staff, Dan Gordon, less than a minute to reject the suggestion to standardize topical sequencing across schools so that teachers could consult with one another in real time. Tallying up the votes: several hundred school-level District educators favored the proposal, two of Rhee’s trusted lieutenants opposed it. It lost.
***Offer and require a keyboarding course in the early grades. DCPS was planning to convert all its testing from paper-and-pencil mode to computer delivery within a few years. Yet, keyboarding courses were rare in the early grades. Obviously, without systemwide keyboarding training, some students would be at a disadvantage in computer testing.
Suggestion rejected.
In all, I had polled over 500 DCPS school staff. Not only were all of their suggestions reasonable, some were essential in order to comply with professional assessment standards and ethics.
Nonetheless, back at DCPS’s central office, each suggestion was rejected without, to my observation, any serious consideration. The rejecters included Chancellor Rhee, the head of the office of data and accountability—the self-titled “Data Lady,” Erin McGoldrick—and the head of the curriculum and instruction division, Carey Wright, and her chief deputy, Dan Gordon.
Four central office staff outvoted several hundred school staff (and my recommendations as assessment director). In each case, the changes recommended would have meant some additional work on their parts, but in return for substantial improvements in the testing program. Their rhetoric was all about helping teachers and students; but the facts were that the testing program wasn’t structured to help them.
What was the purpose of my several weeks of school visits and staff polling? To solicit “buy in” from school level staff, not feedback.
Ultimately, the new testing program proposal would incorporate all the new features requested by senior central office staff, no matter how burdensome, and not a single feature requested by several hundred supportive school-level staff, no matter how helpful. Like many others, I had hoped that the education reform intention of the Rhee-Henderson years was genuine. DCPS could certainly have benefitted from some genuine reform.
Alas, much of the activity labelled “reform” was just for show, and for padding resumes. Numerous central office managers would later work for the Bill and Melinda Gates Foundation. Numerous others would work for entities supported by the Gates or aligned foundations, or in jurisdictions such as Louisiana, where ed reformers held political power. Most would be well paid.
Their genuine accomplishments, or lack thereof, while at DCPS seemed to matter little. What mattered was the appearance of accomplishment and, above all, loyalty to the group. That loyalty required going along to get along: complicity in maintaining the façade of success while withholding any public criticism of or disagreement with other in-group members.
Unfortunately, in the United States what is commonly showcased as education reform is neither a civic enterprise nor a popular movement. Neither parents, the public, nor school-level educators have any direct influence. Rather, at the national level, U.S. education reform is an elite, private club—a small group of tightly connected politicos and academics—a mutual admiration society dedicated to the career advancement, political influence, and financial benefit of its members, supported by a gaggle of wealthy foundations (e.g., Gates, Walton, Broad, Wallace, Hewlett, Smith-Richardson).
For over a decade, The Ed Reform Club exploited DC for its own benefit. Local elite formed the DC Public Education Fund (DCPEF) to sponsor education projects, such as IMPACT, which they deemed worthy. In the negotiations between the Washington Teachers’ Union and DCPS concluded in 2010, DCPEF arranged a 3-year grant of $64.5 million from the Arnold, Broad, Robertson, and Walton foundations to fund a 5-year retroactive teacher pay raise in return for contract language allowing teacher excessing tied to IMPACT, which Rhee promised would lead to annual student test score increases by 2012. Projected goals were not met; foundation support continued nonetheless.
Michelle Johnson (nee Rhee) chaired the board of a charter school chain in California and occasionally collects $30,000+ in speaker fees but, otherwise, seems to have deliberately withdrawn from the limelight. Despite contributing her own additional scandals after she assumed the DCPS chancellorship, Kaya Henderson ascended to great fame and glory with a “distinguished professorship” at Georgetown; honorary degrees from Georgetown and Catholic universities; gigs with the Chan Zuckerberg Initiative, Broad Leadership Academy, and Teach for All; and board memberships with The Aspen Institute, The College Board, Robin Hood NYC, and Teach For America. Carey Wright is now state superintendent in Mississippi. Dan Gordon runs a 30-person consulting firm, Education Counsel, which strategically partners with major players in U.S. education policy. The manager of the IMPACT teacher evaluation program, Jason Kamras, now works as superintendent of the Richmond, VA public schools.
Arguably the person most directly responsible for the recurring assessment system fiascos of the Rhee-Henderson years, then chief of data and accountability Erin McGoldrick, now specializes in “data innovation” as partner and chief operating officer at an education management consulting firm. Her firm, Kitamba, strategically partners with its own panoply of major players in U.S. education policy. Its list of recent clients includes the DC Public Charter School Board and DCPS.
If the ambitious DC central office folk who gaudily declared themselves leading education reformers were not really, who were the genuine education reformers during the Rhee-Henderson decade of massive upheaval and per-student expenditures three times those in the state of Utah? They were the school principals and staff whose practical suggestions were ignored by central office glitterati. They were whistleblowers like history teacher Erich Martel, who had documented DCPS’s manipulation of student records and phony graduation rates years before the investigation of Ballou High School and was demoted and then “excessed” by Henderson. Or school principal Adell Cothorne, who spilled the beans on test answer sheet “erasure parties” at Noyes Education Campus and lost her job under Rhee.
Peter Greene at Curmudgucation gets it right again, even more so when we realize that big business has always been lying about not having enough skilled workers.
Democrats Need A New Theory Of Action
Posted: 28 Dec 2020 07:24 AM PST
For four years, Democrats have had a fairly simple theory of action when it came to education.
Something along the lines of “Good lord, a crazy lady just came into our china shop riding a bull, waving around a flamethrower, and dragging a shark with a head-mounted laser beam; we have to stop her from destroying the place (while pretending that we have a bull and a shark in the back just like hers).”
Now, of course, that will, thank heavens, no longer fit the circumstances. The Democrats will need a new plan.
Trouble is, the old plan, the one spanning both the Clinton and Obama years, is not a winner. It went, roughly, like this:
The way to fix poverty, racism, injustice, inequity and economic strife is to get a bunch of children to make higher scores on a single narrow standardized test; the best shot at getting this done is to give education amateurs the opportunity to make money doing it.
This was never, ever a good plan. Ever. Let me count the ways. For one thing, education’s ability to fix social injustice is limited. Having a better education will not raise the minimum wage. It will not eradicate poverty. And as we’ve just spent four years having hammered into us, it will not even be sure to make people better thinkers or cleanse them of racism. It will help some people escape the tar pit, but it will not cleanse the pit itself.
And that, of course, is simply talking about education, and that’s not what the Dems’ theory was about anyway–it was about a mediocre computer-scorable once-a-year test of math and reading. And that was never going to fix a thing. Nobody was going to get a better job because she got a high score on the PARCC. Nobody was ever going to achieve a happier, healthier life just because they’d raised their Big Standardized Test scores by fifty points. Any such score bump was always going to be the result of test prep and test-taker training, and that sort of preparation was always going to come at the expense of real education.
Now, a couple of decades on, all the evidence says that test-centric education didn’t improve society, schools, or the lives of the young humans who passed through the system.
Democrats must also wrestle with the fact that many of the ideas attached to this theory of action were always conservative ideas, always ideas that didn’t belong to traditional Democratic Party stuff at all.
Jack Schneider and Jennifer Berkshire talk about a “treaty” between Dems and the GOP, and that’s a way to look at how the ed reform movement brought people into each side who weren’t natural fits. The conservative market reform side teamed up with folks who believed choice was a matter of social justice, and that truce held until about four years ago, actually before Trump was elected.
Meanwhile, in Schneider and Berkshire’s telling, Democrats gave up supporting teachers (or at least their unions) while embracing the Thought Leadership of groups like Democrats for Education Reform, a group launched by hedge fund guys who adopted “Democrat” because it seemed like a good way to get the support they needed. Plus (and this seems like it was a thousand years ago) embracing “heroes” like Michelle Rhee, nominally listed as a Democrat, but certainly not acting like one.
All of this made a perfect soup for feeding neo-liberals. It had the additional effect of seriously muddying the water about what, exactly, Democrats stand for when it comes to public education. The laundry list of ideas now has two problems. One is that they have all been given a long, hard trial, and they’ve failed. The other, which is perhaps worse from a political gamesmanship standpoint, is that they have Trump/DeVos stink all over them.
But while Dems and the GOP share the problems with the first half of that statement, it’s the Democrats who have to own the second part. The amateur part.
I often complain that the roots of almost all our education woes for the modern reform period come from the empowerment of clueless amateurs, and while it may appear at first glance that both parties are responsible, on closer examination, I’m not so sure.
The GOP position hasn’t been that we need more amateurs and fewer professionals–their stance is that education is being run by the wrong profession. Eli Broad has built his whole edu-brand on the assertion that education doesn’t have education problems, it has business management problems, and that they will best be solved by management professionals.
In some regions, education has been reinterpreted by conservatives as a real estate problem, best solved by real estate professionals. The conservative model calls for education to be properly understood as a business, and as such, run not by elected bozos on a board or by a bunch of teachers, but by visionary CEOs with the power to hire and fire and set the rules and not be tied down by regulations and unions.
Democrats of the neo-liberal persuasion kind of agree with that last part. And they have taken it a step further by embracing the notion that all it takes to run a school is a vision, with no professional expertise of any sort at all.
I blame Democrats for the whole business of putting un-trained Best and Brightest Ivy Leaguers in classrooms, and then letting them turn around and use their brief classroom visit to establish themselves as “experts” capable of running entire district or even state systems. It takes Democrats to decide that a clueless amateur like David Coleman should be given a chance to impose his vision on the entire nation (and it takes right-tilted folks to see that this is a perfect chance to cash in big time).
Am I over-simplifying? Sure.
But you get the idea.
Democrats turned their backs on public education and the teaching profession. They decided that virtually every ill in society is caused by teachers with low expectations and lousy standards, and then they jumped on the bandwagon that insisted that somehow all of that could be fixed by making students take a Big Standardized Test and generating a pile of data that could be massaged for any and all purposes (never forget–No Child Left Behind was hailed as a great bi-partisan achievement). I would be far more excited about Biden if at any point in the campaign he had said something along the lines of, “Boy, did we get education policy wrong.”
And I suppose that’s a lot to ask.
But if Democrats are going to launch a new day in education, they have a lot to turn their backs on, along with a pressing need for a new theory of action. They need to reject the concept of an entire system built on the flawed foundation of a single standardized test. Operating with flawed data is, in fact, worse than no data at all, and for decades ed policy has been driven by folks looking for their car keys under a lamppost hundreds of feet away from where the keys were dropped because “the light’s better over here.”
They need to embrace the notion that teachers are, in fact, the pre-eminent experts in the field of education.
They need to accept that while education can be a powerful engine for pulling against the forces of inequity and injustice, those forces also shape the environment within which schools must work.
They need to stop listening to amateurs. Success in other fields does not qualify someone to set education policy. Cruising through a classroom for two years does not make someone an education expert. Everyone who ever went to the doctor is not a medical expert, everyone who ever had their car worked on is not a mechanic, and everyone who ever went to school is not an education expert. Doesn’t mean they can’t add something to the conversation, but they shouldn’t be leading it.
They need to grasp that schools are not businesses. And not only are schools not businesses, but their primary function is not to supply businesses with useful worker bees. If they want to run multiple parallel education systems with charters and vouchers and all the rest, they need to face up to properly funding them. If they won’t do that, then they need to shut up about choicey policies.
“We can run three or four school systems for the cost of one” was always a lie, and it’s time to stop pretending otherwise. Otherwise school choice is just one more unfunded mandate. They need to accept that privatized school systems have not come up with anything new, revolutionary, or previously undiscovered about education. But they have come up with some clever new ways to waste and make off with taxpayer money.
Listen to teachers. Listen to parents in the community served by the school. Commit to a search for long term solutions instead of quick fixy silver bullets. And maybe become a force for public education slightly more useful than simply fending off a crazy lady with a flamethrower.
Ten years ago, I worked as the Director of Assessments for the District of Columbia Public Schools (DCPS). For temporal context, I arrived after the first of the infamous test cheating scandals and left just before the incident that spawned a second. Indeed, I filled a new position created to both manage test security and design an expanded testing program. I departed shortly after Vincent Gray, who opposed an expanded testing program, defeated Adrian Fenty in the September 2010 DC mayoral primary. My tenure coincided with Michelle Rhee’s last nine months as Chancellor.
The recurring test cheating scandals of the Rhee-Henderson years may seem extraordinary but, in fairness, DCPS was more likely than the average US school district to be caught because it received a much higher degree of scrutiny. Given how tests are typically administered in this country, the incidence of cheating is likely far greater than news accounts suggest, for several reasons:
· in most cases, those who administer tests—schoolteachers and administrators—have an interest in their results;
· test security protocols are numerous and complicated yet, nonetheless, the responsibility of non-expert ordinary school personnel, guaranteeing their inconsistent application across schools and over time;
· after-the-fact statistical analyses are not legal proof—the odds against a certain number of wrong-to-right erasures in a single classroom on a paper-and-pencil test being coincidental may be a thousand to one, but one-in-a-thousand is still legally plausible; and
· after-the-fact investigations based on interviews are time-consuming, scattershot, and uneven.
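The third point above can be illustrated with a little arithmetic. The sketch below is only an illustration, not an analysis of actual DC data: it takes the one-in-a-thousand figure at face value, assumes classrooms are statistically independent (a simplification), and uses a thousand classrooms, the scale this article gives for a single DC test administration. The function name is hypothetical.

```python
def prob_at_least_one_chance_flag(p_single: float, n_classrooms: int) -> float:
    """Probability that at least one of n classrooms shows the erasure
    anomaly purely by coincidence, assuming classrooms are independent."""
    return 1.0 - (1.0 - p_single) ** n_classrooms


# Illustrative assumption: a 1-in-1000 chance per classroom.
p = 1 / 1000

for n in (1, 100, 1000):
    chance = prob_at_least_one_chance_flag(p, n)
    print(f"{n:>5} classrooms: {chance:.3f}")
```

Under these assumptions, a coincidental flag somewhere among a thousand classrooms is more likely than not, which is one way to see why a single statistical red flag, taken alone, falls short of legal proof.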
Still, there were measures that the Rhee-Henderson administrations could have adopted to substantially reduce the incidence of cheating, but they chose none that might have been effective. Rather, they dug in their heels, insisted that only a few schools had issues, which they thoroughly resolved, and repeatedly denied any systematic problem.
Cheating scandals
From 2007 to 2009 rumors percolated of an extraordinary level of wrong-to-right erasures on the test answer sheets at many DCPS schools. “Erasure analysis” is one among several “red flag” indicators that testing contractors calculate to monitor cheating. The testing companies take no responsibility for investigating suspected test cheating, however; that responsibility falls to the customer, the local or state education agency.
In her autobiographical account of her time as DCPS Chancellor, Michelle Johnson (née Rhee) wrote (p. 197):
“For the first time in the history of DCPS, we brought in an outside expert to examine and audit our system. Caveon Test Security – the leading expert in the field at the time – assessed our tests, results, and security measures. Their investigators interviewed teachers, principals, and administrators.
“Caveon found no evidence of systematic cheating. None.”
Caveon, however, had not looked for “systematic” cheating. All they did was interview a few people at several schools where the statistical anomalies were more extraordinary than at others. As none of those individuals would admit to knowingly cheating, Caveon branded all their excuses as “plausible” explanations. That’s it; that is all that Caveon did. But, Caveon’s statement that they found no evidence of “widespread” cheating—despite not having looked for it—would be frequently invoked by DCPS leaders over the next several years.[1]
Incidentally, prior to the revelation of its infamous decades-long, systematic test cheating, the Atlanta Public Schools had similarly retained Caveon Test Security and was, likewise, granted a clean bill of health. Only later did the Georgia state attorney general swoop in and reveal the truth.
In its defense, Caveon would note that several cheating prevention measures it had recommended to DCPS were never adopted.[2] None of the cheating prevention measures that I recommended were adopted, either.
The single most effective means for reducing in-classroom cheating would have been to rotate teachers on test days so that no teacher administered a test to his or her own students. It would not have been that difficult to randomly assign teachers to different classrooms on test days.
The single most effective means for reducing school administrator cheating would have been to rotate test administrators on test days so that none managed the test materials for their own schools. The visiting test administrators would have been responsible for keeping test materials away from the school until test day, distributing sealed test booklets to the rotated teachers on test day, and for collecting re-sealed test booklets at the end of testing and immediately removing them from the school.
Instead of implementing these, or a number of other feasible and effective test security measures, DCPS leaders increased the number of test proctors, assigning each of a few dozen or so central office staff a school to monitor. Those proctors could not reasonably manage the volume of oversight required. A single DC test administration could encompass a hundred schools and a thousand classrooms.
Investigations
So, what effort, if any, did DCPS make to counter test cheating? They hired me, but then rejected all my suggestions for increasing security. Also, they established a telephone tip line. Anyone who suspected cheating could report it, even anonymously, and, allegedly, their tip would be investigated.
Some forms of cheating are best investigated through interviews. Probably the most frequent forms of cheating at DCPS—teachers helping students during test administrations and school administrators looking at test forms prior to administration—leave no statistical residue. Eyewitness testimony is the only type of legal evidence available in such cases, but it is not just inconsistent, it may be socially destructive.
I remember two investigations best: one occurred in a relatively well-to-do neighborhood with well-educated parents active in school affairs; the other in one of the city’s poorest neighborhoods. Superficially, the cases were similar—an individual teacher was accused of helping his or her own students with answers during test administrations. Making a case against either elementary school teacher required sworn testimony from eyewitnesses, that is, students—eight-to-ten-year olds.
My investigations, then, consisted of calling children into the principal’s office one-by-one to be questioned about their teacher’s behavior. We couldn’t hide the reason we were asking the questions. And, even though each student agreed not to tell others what had occurred in their visit to the principal’s office, we knew we had only one shot at an uncorrupted jury pool.
Though the accusations against the two teachers were similar and the cases against them equally strong, the outcomes could not have been more different. In the high-poverty neighborhood, the students seemed suspicious and said little; none would implicate the teacher, whom they all seemed to like.
In the more prosperous neighborhood, students were more outgoing, freely divulging what they had witnessed. The students had discussed the alleged coaching with their parents who, in turn, urged them to tell investigators what they knew. During his turn in the principal’s office, the accused teacher denied any wrongdoing. I wrote up each interview, then requested that each student read and sign.
Thankfully, that accused teacher made a deal and left the school system a few weeks later. Had he not, we would have required the presence in court of the eight-to-ten-year olds to testify under oath against their former teacher, who taught multi-grade classes. Had that prosecution not succeeded, the eyewitness students could have been routinely assigned to his classroom the following school year.
My conclusion? Only in certain schools is the successful prosecution of a cheating teacher through eyewitness testimony even possible. But, even where possible, it consumes inordinate amounts of time and, otherwise, comes at a high price, turning young innocents against authority figures they naturally trusted.
Cheating blueprints
Arguably the most widespread and persistent testing malfeasance in DCPS received little attention from the press. Moreover, it was directly propagated by District leaders, who published test blueprints on the web. Put simply, test “blueprints” are lists of the curricular standards (e.g., “student shall correctly add two-digit numbers”) and the number of test items included in an upcoming test related to each standard. DC had been advance publishing its blueprints for years.
I argued that the way DC did it was unethical. The head of the Division of Data & Accountability, Erin McGoldrick, however, defended the practice, claimed it was common, and cited its existence in the state of California as precedent. The next time she and I met for a conference call with one of DCPS’s test providers, Discovery Education, I asked their sales agent how many of their hundreds of other customers advance-published blueprints. His answer: none.
In the state of California, the location of McGoldrick’s only prior professional experience, blueprints were, indeed, published in advance of test administrations. But their tests were longer than DC’s and all standards were tested. Publication of California’s blueprints served more to remind the populace what the standards were in advance of each test administration. Occasionally, a standard considered to be of unusual importance might be assigned a greater number of test items than the average, and the California blueprints signaled that emphasis.
In Washington, DC, the tests used in judging teacher performance were shorter, covering only some of each year’s standards. So, DC’s blueprints showed everyone well in advance of the test dates exactly which standards would be tested and which would not. For each teacher, this posed an ethical dilemma: should they “narrow the curriculum” by teaching only that content they knew would be tested? Or, should they do the right thing and teach all the standards, as they were legally and ethically bound to, even though it meant spending less time on the to-be-tested content? It’s quite a conundrum when one risks punishment for behaving ethically.
Monthly meetings convened to discuss issues with the districtwide testing program, the DC Comprehensive Assessment System (DC-CAS)—administered to comply with the federal No Child Left Behind (NCLB) Act. All public schools, both DCPS and charters, administered those tests. At one of these regular meetings, two representatives from the Office of the State Superintendent of Education (OSSE) announced plans to repair the broken blueprint process.[3]
The State Office employees argued thoughtfully and reasonably that it was professionally unethical to advance publish DC test blueprints. Moreover, they had surveyed other US jurisdictions in an effort to find others that followed DC’s practice and found none. I was the highest-ranking DCPS employee at the meeting and I expressed my support, congratulating them for doing the right thing. I assumed that their decision was final.
I mentioned the decision to McGoldrick, who expressed surprise and speculated that it might not have been made at the highest level in the organizational hierarchy. Wasting no time, she met with other DCPS senior managers and the proposed change was forthwith shelved. In that, and other ways, the DCPS tail wagged the OSSE dog.
* * *
It may be too easy to finger ethical deficits for the recalcitrant attitude toward test security of the Rhee-Henderson era ed reformers. The columnist Peter Greene insists that knowledge deficits among self-appointed education reformers also matter:
“… the reformistan bubble … has been built from Day One without any actual educators inside it. Instead, the bubble is populated by rich people, people who want rich people’s money, people who think they have great ideas about education, and even people who sincerely want to make education better. The bubble does not include people who can turn to an Arne Duncan or a Betsy DeVos or a Bill Gates and say, ‘Based on my years of experience in a classroom, I’d have to say that idea is ridiculous bullshit.’”
“There are a tiny handful of people within the bubble who will occasionally act as bullshit detectors, but they are not enough. The ed reform movement has gathered power and money and set up a parallel education system even as it has managed to capture leadership roles within public education, but the ed reform movement still lacks what it has always lacked–actual teachers and experienced educators who know what the hell they’re talking about.”
In my twenties, I worked for several years in the research department of a state education agency. My primary political lesson from that experience, consistently reinforced subsequently, is that most education bureaucrats tell the public that the system they manage works just fine, no matter what the reality. They can get away with this because they control most of the evidence and can suppress it or spin it to their advantage.
In this proclivity, the DCPS central office leaders of the Rhee-Henderson era proved themselves to be no different than the traditional public-school educators they so casually demonized.
US school systems are structured to be opaque and, it seems, both educators and testing contractors like it that way. For their part, and contrary to their rhetoric, Rhee, Henderson, and McGoldrick passed on many opportunities to make their system more transparent and accountable.
[1] A perusal of Caveon’s website clarifies that their mission is to help their clients–state and local education departments–not get caught. Sometimes this means not cheating in the first place; other times it might mean something else. One might argue that, ironically, Caveon could be helping its clients to cheat in more sophisticated ways and cover their tracks better.
[2] Among them: test booklets should be sealed until the students open them and resealed by the students immediately after; and students should be assigned seats on test day and a seating chart submitted to test coordinators (necessary for verifying cluster patterns in student responses that would suggest answer copying).
[3] Yes, for those new to the area, the District of Columbia has an Office of the “State” Superintendent of Education (OSSE). Its domain of relationships includes not just the regular public schools (i.e., DCPS), but also other public schools (i.e., charters) and private schools. Practically, it primarily serves as a conduit for funneling money from a menagerie of federal education-related grant and aid programs.
Short answer: nothing that would actually help students or teachers. But it’s made for well-padded resumes for a handful of insiders.
This is an important review, by the then-director of assessment. His criticisms echo the points that I have been making along with Mary Levy, Erich Martel, Adell Cothorne, and many others.
Looking Back on DC Education Reform 10 Years After,
Part 1: The Grand Tour
Richard P Phelps
Ten years ago, I worked as the Director of Assessments for the District of Columbia Public Schools (DCPS). My tenure coincided with Michelle Rhee’s last nine months as Chancellor. I departed shortly after Vincent Gray defeated Adrian Fenty in the September 2010 DC mayoral primary.
My primary task was to design an expansion of the testing program that served the IMPACT teacher evaluation system to include all core subjects and all grade levels. Despite its fame (or infamy), the test score aspect of the IMPACT program affected only 13% of teachers, those teaching either reading or math in grades four through eight. Only those subjects and grade levels included the requisite pre- and post-tests required for teacher “value added” measurements (VAM). Not included were most subjects (e.g., science, social studies, art, music, physical education), grades kindergarten to two, and high school.
Chancellor Rhee wanted many more teachers included. So, I designed a system that would cover more than half the DCPS teacher force, from kindergarten through high school. You haven’t heard about it because it never happened. The newly elected Vincent Gray had promised during his mayoral campaign to reduce the amount of testing; the proposed expansion would have increased it fourfold.
VAM affected teachers’ jobs. A low value-added score could lead to termination; a high score, to promotion and a cash bonus. VAM as it was then structured was obviously, glaringly flawed,[1] as anyone with a strong background in educational testing could have seen. Unfortunately, among the many new central office hires from the elite of ed reform circles, none had such a background.
Before posting a request for proposals from commercial test developers for the testing expansion plan, I was instructed to survey two groups of stakeholders—central office managers and school-level teachers and administrators.
Not surprisingly, some of the central office managers consulted requested additions or changes to the proposed testing program where they thought it would benefit their domain of responsibility. The net effect on school-level personnel would have been to add to their administrative burden. Nonetheless, all requests from central office managers would be honored.
The Grand Tour
At about the same time, over several weeks of the late Spring and early Summer of 2010, along with a bright summer intern, I visited a dozen DCPS schools. The alleged purpose was to collect feedback on the design of the expanded testing program. I enjoyed these meetings. They were informative, animated, and very well attended. School staff appreciated the apparent opportunity to contribute to policy decisions and tried to make the most of it.
Each school greeted us with a full complement of faculty and staff on their days off, numbering several dozen educators at some venues. They believed what we had told them: that we were in the process of redesigning the DCPS assessment program and were genuinely interested in their suggestions for how best to do it.
At no venue did we encounter stand-pat knee-jerk rejection of education reform efforts. Some educators were avowed advocates for the Rhee administration’s reform policies, but most were basically dedicated educators determined to do what was best for their community within the current context.
The Grand Tour was insightful, too. I learned for the first time of certain aspects of DCPS’s assessment system that were essential to consider in its proper design, aspects of which the higher-ups in the DCPS Central Office either were not aware or did not consider relevant.
The group of visited schools represented DCPS as a whole in appropriate proportions geographically, ethnically, and by education level (i.e., primary, middle, and high). Within those parameters, however, only schools with “friendly” administrations were chosen. That is, we only visited schools with principals and staff openly supportive of the Rhee-Henderson agenda.
But even they desired changes to the testing program, whether or not it was expanded. Their suggestions covered both the annual districtwide DC-CAS (or “comprehensive” assessment system), on which the teacher evaluation system was based, and the DC-BAS (or “benchmarking” assessment system), a series of four annual “no-stakes” interim tests unique to DCPS, ostensibly offered to help prepare students and teachers for the consequential-for-some-school-staff DC-CAS.[2]
At each staff meeting I asked for a show of hands on several issues of interest that I thought were actionable. Some suggestions for program changes received close to unanimous support. Allow me to describe several.
1. Move DC-CAS test administration later in the school year. Many citizens may have logically assumed that the IMPACT teacher evaluation numbers were calculated from a standard pre-post test schedule, testing a teacher’s students at the beginning of their academic year together and then again at the end. In 2010, however, the DC-CAS was administered in March, three months before school year end. Moreover, that single administration of the test served as both pre- and post-test, posttest for the current school year and pretest for the following school year. Thus, before a teacher even met their new students in late August or early September, almost half of the year for which teachers were judged had already transpired—the three months in the Spring spent with the previous year’s teacher and almost three months of summer vacation.
School staff recommended pushing DC-CAS administration to later in the school year. Furthermore, they advocated a genuine pre-post-test administration schedule—pre-test the students in late August–early September and post-test them in late-May–early June—to cover a teacher’s actual span of time with the students.
This suggestion was rejected because the test development firm with the DC-CAS contract required three months to score some portions of the test in time for the IMPACT teacher ratings scheduled for early July delivery, before the start of the new school year. Some small number of teachers would be terminated based on their IMPACT scores, so management demanded those scores be available before preparations for the new school year began.[3] The tail wagged the dog.
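The “almost half” figure in item 1 can be checked with a rough date calculation. The sketch below is illustrative only: the article gives months, not exact dates, so the specific days (mid-March test dates, a September 1 start) are assumptions, and the function name is hypothetical.

```python
from datetime import date

# Assumed dates; the article specifies only "March" and
# "late August or early September".
pre_test = date(2009, 3, 15)        # March DC-CAS, serving as the pretest
new_year_start = date(2009, 9, 1)   # teacher first meets the new students
post_test = date(2010, 3, 15)       # next March DC-CAS, serving as the posttest


def share_before_teacher(pre: date, start: date, post: date) -> float:
    """Fraction of the pretest-to-posttest window that elapses before
    the evaluated teacher ever meets the students."""
    return (start - pre).days / (post - pre).days


share = share_before_teacher(pre_test, new_year_start, post_test)
print(f"Share of the measured year outside the teacher's control: {share:.0%}")
```

With these assumed dates, a bit less than half of the measurement window falls before the teacher’s first day with the class, consistent with the description in item 1.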
2. Add some stakes to the DC-CAS in the upper grades. Because DC-CAS test scores portended consequences for teachers but none for students, some students expended little effort on the test. Indeed, extensive research on “no-stakes” (for students) tests reveals that motivation and effort vary by a range of factors including gender, ethnicity, socioeconomic class, the weather, and age. Generally, the older the student, the lower the test-taking effort. This disadvantaged some teachers in the IMPACT ratings for circumstances beyond their control: unlucky student demographics.
Central office management rejected this suggestion to add even modest stakes to the upper grades’ DC-CAS; no reason given.
3. Move one of the DC-BAS tests to year end. If management rejected the suggestion to move DC-CAS test administration to the end of the school year, school staff suggested scheduling one of the no-stakes DC-BAS benchmarking tests for late May–early June. As it was, the schedule squeezed all four benchmarking test administrations between early September and mid-February. Moving just one of them to the end of the year would give the following year’s teachers a more recent reading (by more than three months) of their new students’ academic levels and needs.
Central Office management rejected this suggestion probably because the real purpose of the DC-BAS was not to help teachers understand their students’ academic levels and needs, as the following will explain.
4. Change DC-BAS tests so they cover recently taught content. Many DC citizens probably assumed that, like most tests, the DC-BAS interim tests covered recently taught content, such as that covered since the previous test administration. Not so in 2010. The first annual DC-BAS was administered in early September, just after the year’s courses commenced. Moreover, it covered the same content domain—that for the entirety of the school year—as each of the next three DC-BAS tests.
School staff proposed changing the full-year “comprehensive” content coverage of each DC-BAS test to partial-year “cumulative” coverage, so students would only be tested on what they had been taught prior to each test administration.
This suggestion, too, was rejected. Testing the same full-year comprehensive content domain produced a predictable, flattering score rise. With each DC-BAS test administration, students recognized more of the content, because they had just been exposed to more of it, so average scores predictably rose. With test scores always rising, it looked like student achievement improved steadily each year. Achieving this contrived score increase required testing students on some material to which they had not yet been exposed, both a violation of professional testing standards and a poor method for instilling student confidence. (Of course, it was also less expensive to administer essentially the same test four times a year than to develop four genuinely different tests.)
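A toy model makes the mechanism behind that score rise concrete. Everything numeric here is a hypothetical assumption chosen for illustration (the fraction of the year’s content taught by each of the four administrations, and the chance of a correct answer on taught versus untaught items); none of it is DCPS data. The point is only that, with students’ mastery of taught material held constant, a full-year “comprehensive” test yields rising scores while a “cumulative” test of taught content stays flat.

```python
def comprehensive_score(frac_taught: float,
                        p_taught: float = 0.70,
                        p_untaught: float = 0.25) -> float:
    """Expected proportion correct when the test covers the full year's
    content but only frac_taught of it has been taught so far."""
    return frac_taught * p_taught + (1.0 - frac_taught) * p_untaught


def cumulative_score(p_taught: float = 0.70) -> float:
    """Expected proportion correct when the test covers only content
    taught so far (the design school staff proposed)."""
    return p_taught


# Hypothetical share of the year's content taught by each of the four
# DC-BAS administrations (early September through mid-February).
fractions_taught = [0.0, 0.2, 0.4, 0.55]

comprehensive = [comprehensive_score(f) for f in fractions_taught]
cumulative = [cumulative_score() for _ in fractions_taught]

for i, (full, partial) in enumerate(zip(comprehensive, cumulative), start=1):
    print(f"DC-BAS {i}: comprehensive {full:.2f}, cumulative {partial:.2f}")
```

In this sketch the comprehensive scores climb with every administration even though nothing about how well students learn taught material has changed, which is the contrived increase described above.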
5. Synchronize the sequencing of curricular content across the District. DCPS management rhetoric circa 2010 attributed classroom-level benefits to the testing program. Teachers would know more about their students’ levels and needs and could also learn from each other. Yet, the only student test results teachers received at the beginning of each school year were half a year old, and most of the information they received over the course of four DC-BAS test administrations was based on not-yet-taught content.
As for cross-district teacher cooperation, unfortunately there was no cross-District coordination of common curricular sequences. Each teacher paced their subject matter however they wished and varied topical emphases according to their own personal preference.
It took DCPS’s Chief Academic Officer, Carey Wright, and her chief of staff, Dan Gordon, less than a minute to reject the suggestion to standardize topical sequencing across schools so that teachers could consult with one another in real time. Tallying up the votes: several hundred school-level District educators favored the proposal, two of Rhee’s trusted lieutenants opposed it. It lost.
6. Offer and require a keyboarding course in the early grades. DCPS was planning to convert all its testing from paper-and-pencil mode to computer delivery within a few years. Yet, keyboarding courses were rare in the early grades. Obviously, without systemwide keyboarding training, some students would be at a disadvantage in computer testing.
Suggestion rejected.
In all, I had polled over 500 DCPS school staff. Not only were all of their suggestions reasonable, some were essential in order to comply with professional assessment standards and ethics.
Nonetheless, back at DCPS’ Central Office, each suggestion was rejected without, to my observation, any serious consideration. The rejecters included Chancellor Rhee, the head of the office of Data and Accountability—the self-titled “Data Lady,” Erin McGoldrick—and the head of the curriculum and instruction division, Carey Wright, and her chief deputy, Dan Gordon.
Four central office staff outvoted several-hundred school staff (and my recommendations as assessment director). In each case, the changes recommended would have meant some additional work on their parts, but in return for substantial improvements in the testing program. Their rhetoric was all about helping teachers and students; but the facts were that the testing program wasn’t structured to help them.
What was the purpose of my several weeks of school visits and staff polling? To solicit “buy in” from school level staff, not feedback.
Ultimately, the new testing program proposal would incorporate all the new features requested by senior Central Office staff, no matter how burdensome, and not a single feature requested by several hundred supportive school-level staff, no matter how helpful. Like many others, I had hoped that the education reform intention of the Rhee-Henderson years was genuine. DCPS could certainly have benefitted from some genuine reform.
Alas, much of the activity labelled “reform” was just for show, and for padding resumes. Numerous central office managers would later work for the Bill and Melinda Gates Foundation. Numerous others would work for entities supported by the Gates or aligned foundations, or in jurisdictions such as Louisiana, where ed reformers held political power. Most would be well paid.
Their genuine accomplishments, or lack thereof, while at DCPS seemed to matter little. What mattered was the appearance of accomplishment and, above all, loyalty to the group. That loyalty required going along to get along: complicity in maintaining the façade of success while withholding any public criticism of or disagreement with other in-group members.
Unfortunately, in the United States what is commonly showcased as education reform is neither a civic enterprise nor a popular movement. Neither parents, the public, nor school-level educators have any direct influence. Rather, at the national level, US education reform is an elite, private club—a small group of tightly-connected politicos and academics—a mutual admiration society dedicated to the career advancement, political influence, and financial benefit of its members, supported by a gaggle of wealthy foundations (e.g., Gates, Walton, Broad, Wallace, Hewlett, Smith-Richardson).
For over a decade, The Ed Reform Club exploited DC for its own benefit. Local elites formed the DC Public Education Fund (DCPEF) to sponsor education projects, such as IMPACT, which they deemed worthy. In the negotiations between the Washington Teachers’ Union and DCPS concluded in 2010, DCPEF arranged a 3-year grant of $64.5M from the Arnold, Broad, Robertson and Walton Foundations to fund a 5-year retroactive teacher pay raise in return for contract language allowing teacher excessing tied to IMPACT, which Rhee promised would lead to annual student test score increases by 2012. Projected goals were not met; foundation support continued nonetheless.
Michelle Johnson (née Rhee) now chairs the board of a charter school chain in California and occasionally collects $30,000+ in speaker fees but, otherwise, seems to have deliberately withdrawn from the limelight. Despite contributing her own additional scandals after she assumed the DCPS Chancellorship, Kaya Henderson ascended to great fame and glory with a “distinguished professorship” at Georgetown; honorary degrees from Georgetown and Catholic Universities; gigs with the Chan Zuckerberg Initiative, Broad Leadership Academy, and Teach for All; and board memberships with The Aspen Institute, The College Board, Robin Hood NYC, and Teach For America. Carey Wright is now state superintendent in Mississippi. Dan Gordon runs a 30-person consulting firm, Education Counsel, that strategically partners with major players in US education policy. The manager of the IMPACT teacher evaluation program, Jason Kamras, now works as Superintendent of the Richmond, VA public schools.
Arguably the person most directly responsible for the recurring assessment system fiascos of the Rhee-Henderson years, then Chief of Data and Accountability Erin McGoldrick, now specializes in “data innovation” as partner and chief operating officer at an education management consulting firm. Her firm, Kitamba, strategically partners with its own panoply of major players in US education policy. Its list of recent clients includes the DC Public Charter School Board and DCPS.
If the ambitious DC central office folk who gaudily declared themselves leading education reformers were not really, who were the genuine education reformers during the Rhee-Henderson decade of massive upheaval and per-student expenditures three times those in the state of Utah? They were the school principals and staff whose practical suggestions were ignored by central office glitterati. They were whistleblowers like history teacher Erich Martel who had documented DCPS’ student records’ manipulation and phony graduation rates years before the Washington Post’s celebrated investigation of Ballou High School, and was demoted and then “excessed” by Henderson. Or, school principal Adell Cothorne, who spilled the beans on test answer sheet “erasure parties” at Noyes Education Campus and lost her job under Rhee.
Real reformers with “skin in the game” can’t play it safe.
The author appreciates the helpful comments of Mary Levy and Erich Martel in researching this article.
Peter Greene has provided a nice flow chart to let you decide whether you should open your mouth with your ideas on how and whether to re-open the public schools, or whether you should just be quiet and listen.
So, should you just hush, or do you have something valuable to contribute to this subject?
My wife and I each taught for 30 years or so, and so we would be in the ‘speak right up’ category, but I don’t really know how the USA can get public education to work next year, especially since the danger is not going away, but apparently once more growing at an exponential clip.
Nobody should be listening to billionaires or their bought-and-paid-for policy wonks who once spent a whole two years in a classroom.
A few quotes from Greene’s column. (He is a much better writer than me, and much more original as well.)
==================================
To everyone who was never a classroom teacher but who has some ideas about how school should be reopened in the fall:
Hush.
Just hush.
There are some special categories of life experiences. Divorce. Parenthood. Deafness. Living as a Black person in the US. Classroom teacher. They are very different experiences, but they all have one thing in common.
You can read about these things. But if you haven’t lived it, you don’t know. You can study up, read up, talk to people. And in some rare cases that brings you close enough to knowing that your insights might actually be useful.
But mostly, you are a Dunning-Kruger case study just waiting to be written up.
The last thirty-seven-ish years of education have been marked by one major feature– a whole lot of people who just don’t know, throwing their weight around and trying to set the conditions under which the people who actually do the work will have to try to actually do the work. Policy wonks, privateers, Teach for America pass-throughs, guys who wanted to run for President, folks walking by on the street who happen to be filthy rich, amateurs who believe their ignorance is a qualification– everyone has stuck their oar in to try to reshape US education. And in ordinary times, as much as I argue against these folks, I would not wave my magic wand to silence them, because 1) educators are just as susceptible as anyone to becoming too insular and entrenched and convinced of their own eternal rightness and 2) it is a teacher’s job to serve all those amateurs, so it behooves the education world to listen, even if what they hear is 98% bosh.
But that’s in ordinary times, and these are not ordinary times.
There’s a whole lot of discussion about the issues involved in starting up school this fall. The discussion is made difficult by the fact that all options stink. It is further complicated by the loud voices of people who literally do not know what they are talking about.
He thought that statistical methods that are useful with farm animals could also be used to measure effectiveness of teachers.
I grew up on a farm, and as both a kid and a young man I had considerable experience handling cows, chickens, and sheep. (These are generic critter photos, not the actual animals we had.)
I also taught math and some science to kids like the ones shown below for over 30 years.
Caring for farm animals and teaching young people are not the same thing.
(Duh.)
As the saying goes: “Teaching isn’t rocket science. It’s much harder.”
I am quite sure that with careful measurements of different types of feed, medications, pasturage, and bedding, it is quite possible to figure out which mix of those elements might help or hinder the production of milk and cream from dairy cows. That’s because dairy or meat cattle (or chickens, or sheep, or pigs) are pretty simple creatures: all a farmer wants is for them to produce lots of high-quality milk, meat, wool, or eggs for the least cost to the farmer, and without getting in trouble.
William Sanders was well-known for his statistical work with dairy cows. His step into hubris and nuttiness was to translate this sort of mathematics to little humans. From Wikipedia:
“The model has prompted numerous federal lawsuits charging that the evaluation system, which is now tied to teacher pay and tenure in Tennessee, doesn’t take into account student-level variables such as growing up in poverty. In 2014, the American Statistical Association called its validity into question, and other critics have said TVAAS should not be the sole tool used to judge teachers.”
But there are several problems with this.
We don’t have an easily-defined and nationally-agreed upon goal for education that we can actually measure. If you don’t believe this, try asking a random set of people what they think should be the primary goal of education, and listen to all the different ideas!
It’s certainly not just ‘higher test scores’ — the math whizzes who brought us “collateralization of debt-swap obligations in leveraged financings” surely had exceedingly high math test scores, but I submit that their character education (as in, ‘not defrauding the public’) was lacking. In their selfishness and hubris, they have succeeded in nearly bankrupting the world economy while buying themselves multiple mansions and yachts, yet causing misery to billions living in slums around the world and millions here in the US who lost their homes and are now sleeping in their cars.
Is our goal also to ‘educate’ our future generations for the lowest cost? Given the prices for the best private schools and private tutors, it is clear that the wealthy believe that THEIR children should be afforded excellent educations that include very small classes, sports, drama, music, free play and exploration, foreign languages, writing, literature, a deep understanding and competency in mathematics & all of the sciences, as well as a solid grounding in the social sciences (including history, civics, and character education). Those parents realize that a good education is expensive, so they ‘throw money at the problem’. Unfortunately, the wealthy don’t want to do the same for the children of the poor.
Reducing the goals of education to just a student’s scores on secretive tests in just two subjects, and claiming that it’s possible to tease out the effectiveness of ANY teacher, even those who teach neither English/Language Arts nor Math, is madness.
Why? Study after study (not by Sanders, of course) has shown that the actual influence of any given teacher accounts for only 1% to 14% of the variation in student test scores. By far the greatest influence is the student’s own family background, not the ability of a single teacher to raise test scores in April. (An effect which I have shown is chimerical: a teacher’s measured effect one year is most likely completely different the next year!)
By comparison, a cow’s life is pretty simple. They eat whatever they are given (be that straw, shredded newspaper, cotton seeds, chicken poop mixed with sawdust, or even the dregs from squeezing out orange juice [no, I’m not making that up]). Cows also poop, drink, pee, chew their cud, and sometimes they try to bully each other. If it’s a dairy cow, it gets milked twice a day, every day, at set times. If it’s a steer, he/it mostly sits around and eats (and poops and pees) until it’s time to send him off to the slaughterhouse. That’s pretty much it.
Gary Rubinstein and I have dissected the value-added scores for New York City public school teachers that were computed and released by the New York Times. We both found that for any given teacher who taught the same subject matter and grade level in the very same school over the period of the NYT data, there was almost NO CORRELATION between their scores from one year to the next.
We also showed that for teachers who were given scores in both math and reading (say, elementary teachers), there was almost no correlation between their scores in math and in reading.
Furthermore, with teachers who were given scores in a single subject (say, math) but at different grade levels (say, 6th and 7th grade math), you guessed it: extremely low correlation.
In other words, it seemed to act like a very, very expensive and complicated random-number generator.
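For readers who want to see what "almost no correlation" looks like in practice, here is a minimal sketch in Python. It does not use the NYC data. The number of teachers, the noise level, and the function names (`pearson`, `simulate_two_years`) are all assumptions I made up for illustration. The idea is simply that if a score is a small stable "teacher effect" plus a lot of year-to-year noise, then the same teacher's scores in two consecutive years will barely correlate.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_two_years(n_teachers, noise_sd, seed=0):
    """Synthetic scores: a stable effect (sd 1) plus fresh noise each year.

    Purely hypothetical data, not drawn from any real evaluation system.
    """
    rng = random.Random(seed)
    stable = [rng.gauss(0, 1) for _ in range(n_teachers)]
    year1 = [s + rng.gauss(0, noise_sd) for s in stable]
    year2 = [s + rng.gauss(0, noise_sd) for s in stable]
    return year1, year2


if __name__ == "__main__":
    for noise_sd in (0.5, 1.0, 3.0):
        y1, y2 = simulate_two_years(n_teachers=2000, noise_sd=noise_sd)
        print(f"noise sd {noise_sd}: year-to-year r = {pearson(y1, y2):.2f}")
```

The more the noise swamps the stable part, the closer the year-to-year correlation gets to zero, which is the pattern a random-number generator would produce.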
People have much, much more complicated inputs, and much more complicated outputs. Someone should have written on William Sanders’ tombstone the phrase “People are not cattle.”
Interesting fact: Jason Kamras was considered to be the architect of Value-Added measurement for teachers in Washington, DC, implemented under the notorious and now-disgraced Michelle Rhee. However, when he left DC to become head of Richmond VA public schools, he did not bring it with him.
I wish I could write half as well as, or as much as, Diane Ravitch manages to do, every single day. I also admire her dedication to fighting the billionaires who have been dictating education policy in the USA for quite some time.
If you are reading this post, you are no doubt aware that only ten years ago, Ravitch did a 180-degree turn on major education issues, admitted she had been wrong on a number of points, and became one of the major forces fighting against the disruptive education-privatization agenda of the billionaires.
Since that time, she has been documenting on her blog, several times a day, nearly every day, the utter failures of the extremely wealthy amateurs who have been claiming to ‘reform’ education, but who have instead merely been disrupting it and failing to achieve any of the goals that they confidently predicted would be won, even using their own yard-sticks.
I found DR’s most recent book (pictured above) to be an excellent history of the past 37 years wherein certain billionaires, and their well-paid acolytes, have claimed that the American public school system is a total failure and needed to be torn down and rebuilt through these steps:
1. Pretending that American students were at one point the highest-scoring ones on the planet (which has NEVER been true) and that the fact that they currently score at middling levels on international tests like PISA is a cause for national alarm;
2. Claiming that student family poverty does not cause lower student achievement (however measured), but the reverse: that the schools that have students from poor and non-white populations are the CAUSE of that poverty and low achievement;
3. Fraudulently assuming that huge fractions of teachers are not only incompetent but actively oppress their students (particularly the poor, the brown, and the black) and need to be fired en masse (as they were in New Orleans, Rhode Island, and Washington, DC);
4. Micromanaging teachers in various ways, including by forcing all states to adopt a never-tested and largely incomprehensible ‘Common Core’ curriculum and demanding that all teachers follow scripted lessons in lockstep;
5. ‘Measuring’ the productivity of teachers through arcane and impenetrable ‘Value-Added’ schemes that were devised for dairy cows;
6. Mass firings of certified teachers, particularly African-American ones (see #2) and replacing them either with untrained, mostly-white newbies from Teach for America or with computers;
7. Requiring public and charter schools (but not vouchers) to spend ever-larger fractions of their classroom time on test prep instead of real learning;
8. Turning billions of public funds over to wealthy amateurs (and con artists) with no educational experience to set up charter schools and voucher schools with no real accountability, the very worst ones being the online charter schools.
One great aspect of this book is that Ravitch points out how
All of those claims and ‘solutions’ have failed (for example, a study in Texas showed charter schools had no impact on test scores and a negative impact on earnings (p. 82));
Teachers, parents, students, and ordinary community members have had a good deal of success in fighting back.
I will conclude with a number of quotes from the book in random colors.
“How many more billions will be required to lift charter school enrollment to 10 percent? [It’s now about 5 percent] And why is it worth the investment, given that charter schools, unless they cherry-pick their students, are no more successful than public schools are and often far worse? Why should the federal government spend nearly half a billion dollars on charter schools that may never open when there are so many desperately underfunded public schools?” (p. 276-277)
“Any movement controlled by billionaires is guaranteed […] to preserve the status quo while offering nothing more than the illusion of change.” (p. 281)
“There is no “Reform movement.” The Disrupters never tried to reform public schools. They wanted to disrupt and privatize the public schools that Americans have relied on for generations. They wanted to put public school funding in private hands. They wanted to short-circuit democracy. They wanted to cripple, not improve, the public schools. They wanted to replace a public service with a free market.” (p. 277)
“Our current education policy is madness. It is madness to destroy public education in pursuit of zany libertarian goals. It is madness to use public funds to put young children into religious schools where they will learn religious doctrine instead of science. It is madness to hand public money over to unaccountable entrepreneurs who want to open a school but refuse to be held to high ethical standards or to be held accountable for its finances and its performance. It is madness to ignore nepotism, self-dealing, and conflicts of interest. We sacrifice our future as a nation if we continue on this path of de-professionalizing our schools and turning them over to businessmen, corporate chains, grifters, and well-meaning amateurs. We sacrifice our children and our grandchildren if we continue to allow them to be guinea pigs in experiments whose negative results are clear.” (p. 281)
Ravitch proposes a number of things that billionaires could do that would be more helpful than what they are currently doing. She suggests [I’m quoting but shortening her list, found on page 280] that the billionaires could …
pay their share of taxes to support well-resourced public schools.
open health clinics to serve needy communities and make sure that all families and children have regular medical checkups.
underwrite programs to ensure that all pregnant women have medical care and that all children have nutritious meals each day.
subsidize after-school programs where children get exercise, play, dramatics, and tutoring.
rebuild the dramatics programs and performance spaces in every school.
lobby their state legislatures to fund schools fairly, to reduce class sizes, and to enable every school to have the teachers, teaching assistants, social services, librarians, nurses, counselors, books, and supplies it needs.
create mental health clinics and treatment centers for those addicted to drugs.
They could emulate the innovative public school that basketball star LeBron James subsidized in Akron, Ohio.
She also quotes Paymon Rouhanifard, who was a “prominent member of the Disruption establishment [who] denounced standardized testing when he stepped down as superintendent of the Camden, New Jersey, public schools […]. He had served as a high-level official on Joel Klein’s team in New York City […] Upon his arrival in the impoverished Camden district […] he developed school report cards to rank every school mainly by test scores. But before he left, he abolished the school report cards.” She quotes him directly: “[…] most everybody in this room wouldn’t tolerate what I described for their own children’s school. Mostly affluent, mostly white schools shy away from heavy testing, and as a result, they are literally receiving an extra month of instruction […] The basic rule, what we would want for our own children, should apply to all kids.” (p. 271)
“Disrupters have used standardized testing to identify and take over or close schools with low scores, but they disregard standardized testing when it reveals the failure of charters and vouchers. Disrupters no longer claim that charter schools and inexperienced recruits from Teach for America will miraculously raise test scores. After three decades of trying, they have not been successful.
“Nothing that the Disrupters have championed has succeeded unless one counts as ‘success’ closing hundreds, perhaps thousands, of community public schools in low-income neighborhoods. The Disrupters have succeeded in demoralizing teachers and reducing the number of people entering the teaching profession. They have enriched entrepreneurs who have opened charter schools or developed shoddy new products and services to sell to schools. They have enhanced the bottom line of large testing corporations. Their fling with the Common Core cost states billions of dollars to implement but had no effect on national or international test scores and outraged many parents, child advocates, lovers of literature, and teachers.”
Fortunately, the resistance to this has been having a fair amount of success, including the massive teacher strikes in state after state. As Ravitch writes (p. 266):
“The teachers taught the nation a lesson.
“But more than that, they taught themselves a lesson. They united, they demanded to be heard, and they got respect. That was something that the Disrupters had denied them for almost twenty years. Teachers learned that in unity there is strength.”