Joel Klein: As Excellent As He Says He Is? Part I

I and many others spent a good deal of time last year documenting the real legacy of Michelle Rhee. This is important work: Rhee occupies an outsized place in the current debate about education “reform,” largely based on claims of her own success, both as a teacher and as Chancellor of Washington, D.C.’s schools.

Thanks to the close scrutiny of Gary Brandenburg, Bob Somerby, Matt DiCarlo, Dana Goldstein, Diane Ravitch, USA Today, and others, we now know the true story: Rhee was never a miracle worker. She was, at best, an average new teacher (meaning she had a long way to go) and a mediocre large-city superintendent when judged by student achievement (when judged by other criteria, she was clearly a trainwreck).

It’s important to get this on the record, because the anti-teacher and anti-union “reforms” Rhee implemented in D.C. – the very ones she wants to impose on the rest of the country – did nothing to effect large-scale changes in educational outcomes. Rhee’s argument for “reform” is, in fact, undercut by her own history.

I say that it’s time to start applying this same level of examination to other prominent members of the corporate “reform” movement. When they make claims of big successes, those claims ought to be vetted very carefully: after all, why should we listen to what they have to say about holding educators accountable if they aren’t held to account themselves?

Which brings us to Joel Klein.

An Article By A NYC Charter School Teacher for “PolicyMic”

I submitted this article about my experiences re: standardized testing and test-based evaluations in NYC schools to PolicyMic this afternoon.  I hope that it resonates with many of you!  Please feel free to read, comment, and forward as you wish.

And, in honor of Mr. Wellstone (July 21, 1944 – October 25, 2002), who reminded us that laborers and teachers are one and the same, “Stand up. Keep fighting.”

Allison LaFave

Romney Loves Teachers: What Teacher Evaluations and Tests Really Mean for American Teachers

During Monday’s final presidential debate, Bob Schieffer spurred a collective American chuckle when he cut off Romney’s long-winded brown-nosing with the knee-slapper, “I think we all love teachers…”

I’d love to believe Mr. Schieffer, but as someone who hails from a family of public school teachers and spent last year teaching third grade in a New York City charter school, I have to say, “Bob. You’re adorable. But America’s teachers haven’t felt loved in quite some time.”

Last spring, my principal corralled our school’s third grade teaching team around a kidney-bean shaped table and apologetically explained that we needed to sign forms acknowledging the weight of our students’ test scores on our end-of-year evaluations. Ultimately, our students’ math and ELA scores would comprise as much as 40% of our annual rating.

Now, I don’t know a single educator who outright opposes the idea of fair evaluations and/or some level of teacher accountability. But as I sat quietly in that little red plastic chair, a voice in me cried:

“You want to evaluate me? Great. No problem.

“But let’s also evaluate the misaligned (or nonexistent) curriculum I was given to plan for my classes.

“Let’s evaluate the number of chairs huddled around single desks, because there are more students in the room than there were last year, and the copy machine, the one that never works.

“Let’s evaluate the number of students with IEPs that aren’t being adequately serviced, and the number of English Language Learner students sitting voiceless in the back of the room, because they have yet to be admitted into nonexistent ELL classes.

“Let’s evaluate the employers who are smugly underpaying/underemploying my students’ parents or guardians, forcing them to work multiple jobs, likely without ever securing benefits for themselves or for their families. Or the number of students who have lost parents or loved ones due to gang violence, substance abuse, or the labyrinth that is our failing criminal justice system. Or the number of my students who didn’t eat dinner last night.

“Let’s evaluate how many hours of sleep I got last night, because I was not afforded adequate prep time during my 10 or 11 hour day in the building, or how many times I’ve skipped out on doctor’s appointments and family events to be here for my students.

“And, finally, let’s evaluate my motivations for being here — because it sure as hell isn’t for the money.”

Last week, Deborah Kenny wrote an op-ed piece decrying the heavy influence of test scores on teacher evaluations. Kenny rightfully claimed that the practice “undermines principals and is demeaning to teachers” and leaves little room for innovative teaching and learning. She went on to say that test-based evaluations inhibit the “culture of trust” between principals and teachers and “discourage the smartest, most talented people from entering the profession.”

While I agree that test-based evaluations are inherently flawed (when was the last time our politicians, Democrats or Republicans, truly analyzed a Pearson test?), I am baffled by Kenny’s ultimate argument. It seems that Kenny bashes test-based evaluations because … wait for it … they make it harder for her to fire teachers she doesn’t like – specifically a teacher whose students performed “exceptionally well” on the state exam.

Teachers aren’t statistics, but they also aren’t part of some school-wide homecoming court. Administrators shouldn’t cast votes for the teachers they like or dislike. They should work to support all teachers who act in the best interest of students.

Ms. Kenny also takes a not-so-subtle jab at teachers’ unions, attacking evil tenured teachers in America, who are clearly exploiting their glamorous roles as K-12 educators. However, unions don’t grant tenure; PRINCIPALS grant tenure. And, moreover, Ms. Kenny, like nearly all charter school administrators in America, likely prohibits her teachers from joining their local union.

As someone who has worked in a non-union school, I can tell Ms. Kenny what violates trust between teachers and administrators. Knowing that you can be fired for your personality.  Knowing that there is a fresh crop of well-intentioned, starry-eyed Teach for America kids who can take your place in the time it takes to make a phone call. Knowing that you will be scorned for using your allotted sick days and guilted into working through lunch, during prep time, and hours after the final school bell rings.

I encourage our presidential candidates (and all Americans) to listen to the voices of practicing teachers, who are so often talked about and around during national education debates.

Says Kelly G., a third grade teacher in Brooklyn:

“These teacher evaluations are complex. I honestly used to think that a teacher could indeed be evaluated and held accountable using test scores. And then I started teaching at a school that didn’t allow me to do the kind of teaching I thought needed to be done in order to develop intelligent children. There’s nothing quite like having your teaching micromanaged and then being told it was your fault the kids didn’t achieve exemplary scores on the state exam.

“My kids are capable of so much already. Come in and look at their writing. Listen to their discussions. Watch them solve math problems. Their test scores will not reflect their growth from the school year. A one-shot assessment does not give a good picture of student achievement. Have you read those exams? Have you been in the room during testing? Test anxiety vomiting is a real thing in the third grade. Too bad they don’t evaluate me on sick child comforting and vomit clean up. I’m sure my scores on those evaluations would be proficient.”

In popular media, teachers are cast as heroes or villains. They are either lazy, money-grubbing, ne’er-do-wells or Jaime Escalante, the “teacher savior” of the acclaimed film Stand and Deliver.

The truth is, as in most professions, the majority of teachers lie somewhere in the middle of this spectrum. Such romanticized notions of teaching make great stories, but that’s just it; they are stories that too often exaggerate and obscure the truth. Jaime Escalante spent years preparing his students for the AP Calculus exam, not a few inspired semesters. Does that mean that he was an inadequate teacher during the years he spent honing his craft and teaching foundational math concepts to his students? How would Escalante have been rated under the New York City evaluation system?

In his research paper entitled “Effects of Inequality and Poverty vs. Teachers and Schooling on America’s Youth,” David C. Berliner (Regents’ Professor Emeritus in The Mary Lou Fulton Teachers College of Arizona State University) finds that “Outside-of-school factors are three times more powerful in affecting student achievement than are the inside-the-school factors.”

Consequently, he concludes, “The best way to improve America’s schools is through jobs that provide families living wages. Other programs…offer some help for students from poor families. But in the end, it is inequality in income and the poverty that accompanies such inequality that matters most for education.”

America’s education system is in crisis; of this, we can be sure. But let’s stop blaming the dentists for their patients’ cavities.

More Value Added Comparisons

Someone who professes to understand Value-Added scores better than I do claims that my graphs for NYC are meaningless because the scores for 2007 were inflated. He claimed that the overall year-to-year and year-to-career value-added correlation coefficients are much higher than what I found; thus, VA is really useful, just not my particular graphs.

Taking this objection seriously, I decided to leave out SY 0607, and compare SY 0506 to SY 0708. Same exact teachers, same exact subjects and grade levels, same exact schools, obviously different (but quite similar) kids.

Here is the scatterplot of what I found. Again, I asked Excel to calculate a line of best fit, and it drew it. Notice that the r-squared correlation value is about 0.05 — seriously LOW. Notice also that this scatterplot is basically a blob again, again a classic example of one variable showing very little correlation with another. (West Virginia’s map has a much more defined shape!) In any case, there are lots (hundreds? thousands?) of teachers with positive VA scores in the first year and negative VA scores the third year, and vice-versa. Only an easily countable handful of teachers have scores of +0.2 or better both years, or worse than -0.2 both years. Out of all of the thousands of teachers. And I bet those are all accidents as well.
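For anyone who wants to replicate this without Excel, here is a minimal sketch of the calculation. It assumes, as is standard for a straight-line fit, that the R-squared a spreadsheet trendline reports equals the squared Pearson correlation of the two columns. The function name `r_squared` and the sample numbers are my own hypothetical illustrations, not values from the NYC spreadsheets.

```python
from statistics import mean


def r_squared(xs, ys):
    """R-squared of the least-squares line through the points (xs, ys).

    For a straight-line fit this is the squared Pearson correlation.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant column has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical value-added scores for five teachers in two different years.
# These are made-up numbers, NOT rows from the actual data.
va_first_year = [0.10, -0.20, 0.30, 0.00, -0.10]
va_third_year = [-0.05, 0.10, 0.00, 0.20, -0.15]
print(r_squared(va_first_year, va_third_year))
```

A value near 1 means the first year's score nearly determines the later one; a value near 0 means it tells you almost nothing.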

So, in other words, I find, as did Gary Rubenstein, that there is extremely little correlation between two things that should be, you would think, very close to a perfect 1.00 correlation. (In the real world, of course, you almost never get a 1.00 correlation between any real entities or quantities.) However, when you are talking about the scores of teachers who have been teaching IN THE SAME SCHOOL, THE SAME SUBJECT, THE SAME GRADE LEVEL for three straight years, then you would think that their performances would be rather similar all three years. If anything, they would normally get better unless they had suffered some sort of physically or mentally debilitating injury or illness (often from old age and the incredible amount of stress). In particular, a lot of teachers will admit to you that they absolutely sucked at teaching during their first year, but that they then figured out a lot of those errors and tried not to make the same ones the next year, so they really improved, or else they quit. But these folks didn’t quit. These are at the very least three-year veterans, which in DC would make them eligible for department or grade level chair at their school as a result of seniority alone, since so many of the older teachers have quit or retired, and the turnover and attrition over the last few years among the newest hires in our school system is probably unprecedented in the history of education. (Perhaps not, but it’s a subject I’d like to pursue.)


Finally, I admit that I exaggerated a bit (for effect) when I said that the shapes of these graphs, and the very low computed values for the r-squared coefficient of linear correlation, made value-added about as predictive as numerology. I thought about that particular exaggeration and wondered how serious it was. Even though I have taken a fairly large number of courses on calculating probabilities and distributions, that kind of calculation is always a bit fraught with error: Have we counted all of the possibilities? Have we left any out? Have we double-counted any of them? Is there a much better, faster, or less error-prone method hidden right around the corner?


So I decided to see whether, in fact, the number of letters in the teachers’ names had any correlation with their Value Added scores. (I thought it was possible, though not very likely.) I discovered that Excel found the r-squared value was about 0.000000. That is zero correlation, my friends. Here is one such scatterplot:

The vertical axis, which goes up the middle, is the number of letters in the teacher’s first name times the number of letters in their last name as listed in the spreadsheet. The horizontal axis, which is at the bottom of the page, is their 2005-2006 value-added score, which can be either negative (theoretically bad) or positive (supposedly good). To me, it sort of looks like a bush that hasn’t been pruned in several years – a classic case of no correlation at all.
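The “numerology” variable on the vertical axis is easy to reproduce. Here is a rough sketch; the function name `name_length_product` is hypothetical, and the choice to count only alphabetic characters (ignoring spaces, hyphens, and apostrophes) is my assumption about how to treat names as listed in a spreadsheet.

```python
def name_length_product(first_name, last_name):
    """Letters in the first name times letters in the last name.

    Assumption: only alphabetic characters count as letters, so
    hyphens, apostrophes, and spaces are ignored.
    """
    def letters(name):
        return sum(1 for ch in name if ch.isalpha())

    return letters(first_name) * letters(last_name)


# Illustrative name only, not a row from the teacher data.
print(name_length_product("Jaime", "Escalante"))  # 5 * 9 = 45
```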

I asked Excel to draw and calculate the line of best fit. It’s the green, nearly-horizontal line near the center of the graph. Notice the r-squared value: 6E-05, which, for all of you innumerates out there, means 0.00006. That is seriously smaller than 0.05: about three orders of magnitude smaller, i.e., roughly one-thousandth as big.

Notice that I’m only using r-squared. Someone objected that I should use just r. If you want, take the square root of all of the correlations I had my computer calculate, and you’ll get r. Compare and contrast.
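If you would rather not do the square roots by hand, here is a short sketch using the two r-squared values quoted in this post (0.05 and 6E-05). The labels are mine. Keep in mind that a square root only gives the size of r; its sign comes from the slope of the line of best fit.

```python
import math

# r-squared values quoted in this post; the labels are just descriptions.
r2_values = {
    "VA 05-06 vs VA 07-08": 0.05,
    "name letters vs VA 05-06": 6e-05,
}

# |r| is the square root of r-squared (the sign is not recoverable this way).
r_values = {label: math.sqrt(r2) for label, r2 in r2_values.items()}
for label, r in r_values.items():
    print(f"{label}: r-squared = {r2_values[label]:g}, |r| = {r:.4f}")

# How many times bigger is one r-squared than the other?
ratio = r2_values["VA 05-06 vs VA 07-08"] / r2_values["name letters vs VA 05-06"]
print(f"ratio of r-squared values: about {ratio:.0f} to 1")
```

Either way you slice it, the value-added correlation is small, and the name-length one is far smaller still.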

So, in any case, I definitely did exaggerate.

Where the data came from

I neglected to give the source for the data for my last two posts. It’s at the website for what looks like a NYC radio or TV station:–2007-2010-nyc-teacher-performance-data#doereports


if you prefer it shorter.

I will warn you that some of the spreadsheets are quite large.

BTW, I just now did a graph showing how well New York City does at predicting the value-added scores of its teachers for school year 2007-2008. The answer seems to be, not very well. Here is the scatter plot:

The correlation is, again, close to zero, even though NYC’s department of assessment and numerology has done their best to try to get it right. In fact, even though the line of best fit doesn’t fit very well, you notice that it slopes downwards to the right. That means that with kids who are predicted to improve relative to the previous year, teachers’ value-added scores are, in general, lower than predicted; whereas with kids who are predicted to do worse than the previous year, teachers’ value-added scores are, in general, a tad higher than predicted.

Not ready for prime time. And not ready to be used to base hiring and firing and bonus decisions on.

Gary Rubenstein is Right: No correlation on Value Added Scores in NYC

One of the things that experimental scientists really should do is to try to replicate each other’s results to see if they are correct or not. I have begun doing that with the value-added scores awarded to teachers in New York City, and I find that I generally agree with the results obtained by Gary Rubenstein.

What I did was look at the value-added scores, in percentiles, that were “awarded” to thousands of New York City public school teachers in school years 05-06, 06-07, and 07-08. I found that there is essentially no correlation between the scores of the exact same teacher from year to year. The r-squared coefficients are on the order of 0.08 to 0.09 – about as close to random as you can ever get in real life.

Here are my two graphs for the night:

I actually had Excel draw the line of regression, but it’s a joke: an r-squared correlation coefficient of 0.0877 means, as I said, that there is extremely little correlation between what any teacher got in school year 05-06 and what they got in SY 06-07. In the same school. With very similar kids. Teaching the same subject.

And, a similar graph comparing teachers’ scores for school year 06-07 with their scores for 07-08:

So, one year, a teacher might be around the 90th percentile. The next year, she might be around the 10th percentile. Or the other way around. Did the teacher suddenly get stupendously better (or worse)? I doubt it. By the time they are adults, most people are pretty consistent. But not according to this graph. In fact, if somebody is in the 90th to 100th percentile in school year 2006-2007, then the probability that they would remain in the same 90th-to-100th-percentile bracket the following year is roughly 1 in 4. If they are in the 0th to 10th percentile in 2006-2007, the chances that they would remain in the same bracket the following year are about 7%!!
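The bracket figures above are just counting: of the teachers who start in a bracket, what fraction are still in it the next year? Here is a minimal sketch of that count. The function name `stay_rate`, the inclusive bracket boundaries, and the six sample teachers are all my own hypothetical choices, not the real data.

```python
def stay_rate(pairs, low, high):
    """Fraction of teachers who start in [low, high] and stay there.

    pairs is a list of (percentile_year_1, percentile_year_2) tuples.
    Returns None if no teacher starts in the bracket.
    Assumption: the bracket includes both endpoints.
    """
    starters = [(p1, p2) for p1, p2 in pairs if low <= p1 <= high]
    if not starters:
        return None
    stayed = sum(1 for _, p2 in starters if low <= p2 <= high)
    return stayed / len(starters)


# Six made-up teachers: (percentile in year 1, percentile in year 2).
sample = [(95, 92), (91, 40), (99, 10), (93, 55), (5, 80), (3, 7)]

print(stay_rate(sample, 90, 100))  # 1 of the 4 top-bracket teachers stayed
print(stay_rate(sample, 0, 10))    # 1 of the 2 bottom-bracket teachers stayed
```

If the scores measured something stable about a teacher, you would expect these fractions to be far above the 10% that pure chance would give for a ten-point bracket.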

What this shows is that using value-added scores to determine if someone should keep their job or get a bonus or a demotion is absolutely insane.

Welcome to the world of the Super-Rich — and to the other world, where the rest of us live

Valerie Strauss has an interesting piece today, written by a middle-class parent who had the opportunity to go to a meeting of some hyper-rich people in New York City having to do with setting up yet another extremely expensive private school in Manhattan for their own privileged children.

The writer draws a number of contrasts with the education of the vast majority of NYC schoolchildren, who attend the public schools.

Here is the URL:

And an analysis of New York City Public Schools: “Doing Less With More”

I didn’t write the following, but I wanted to be sure people got a chance to read about this study. Thanks to Robert Bligh for bringing it to my attention.  Here goes:


Doing Less With More?

Taking a Second Look at New York City Charter Schools

National Education Policy Center – Boulder, CO – Jan. 27, 2011

New study finds NYC charters benefiting from resources but not producing better student test scores than traditional public schools.

Advocates for charter schools have pointed to New York City as an exemplar of how charters can show better results than traditional public schools. Charter advocates have also stated that these schools are able to do more with lesser amounts of funding.

But neither of these claims is correct, according to a new study that closely examines funding and charter school students. “Adding Up the Spending: Fiscal Disparities and Philanthropy among New York City Charter Schools,” a study by Rutgers University professor Bruce Baker and doctoral student Richard Ferris, was published today by the National Education Policy Center (NEPC) at the University of Colorado at Boulder. The study points out that any meaningful understanding of public resources for New York City charters is highly dependent on three factors:

• Does the charter serve students with greater or lesser needs? The City’s charters disproportionately serve lower percentages of poor and English-learner students, who require more resources.

• Are the schools (charter and traditional public) that are compared serving the same grade levels? Charters overwhelmingly serve elementary aged students, and traditional public schools serving those same grades typically have fewer resources than schools serving upper grades.

• Does the Board of Education provide a facility? About half of the City’s charters are given a public facility.

Once the first two factors are considered, the study finds that charter schools not housed in Board of Education facilities receive $517 less in public funding than do non-charters. However, charter schools housed in BOE facilities receive significantly more resources ($2,200 on average more per pupil). But that’s not the end of the story.

The authors ask one additional question: Does the charter receive substantial resources from private donors? They examine audited annual financial reports and IRS tax filings and they discover that the best-endowed charters in the City receive additional resources amounting annually to more than $10,000 per pupil in private funding.

According to lead researcher Bruce Baker, “Finding little truth to the test score claims or the spending claims does not, and should not, end discussions of what we can learn from these New York City charter schools, but it does point to the hypocrisy and emptiness of arguments by charter advocates that additional resources would do little to help traditional public schools. Such arguments are particularly troubling in NYC where high-spending charters far outspend nearby traditional public schools. Equitable and adequate resources do matter, but there appear to be a considerable number of charter schools in NYC doing less with more.”

Find “Adding Up the Spending: Fiscal Disparities and Philanthropy among New York City Charter Schools” by Bruce Baker and Richard Ferris on the web at:

The mission of the National Education Policy Center is to produce and disseminate high-quality, peer reviewed research to inform education policy discussions. We are guided by the belief that the democratic governance of public education is strengthened when policies are based on sound evidence.

For more information on NEPC, please visit

This research brief was made possible in part by the support of the Great Lakes Center for Education Research and Practice.

CONTACT: Bruce Baker or William Mathis 732-932-7496 ext. 8232  or 802-383-0058


