## The VAM Equation Explained So We Can All Understand It (Not!)

At long last, Jason Kamras has let the equation out of the bag, providing Sarah Bax with a two-page description of the secret algorithm by which teachers are judged:

As far as I can tell, looking at this piece of obfuscation, the only variable that is at all spelled out is the one dealing with the proportion of days that the student attended school during the prior year. Everything else appears to be deliberately vague. The only thing I like is that they were honest enough to include an “error term” – but how big is it? Is it also a ‘vector’ composed of a number of other values, or just a simple number? How is it calculated? Is it larger in some schools, grades and subjects than in others, or does one error fit all?

In case you were wondering, a mathematical vector is a quantity made up of several components listed in a fixed order. The simplest example I can think of would be the coordinates of a point, which you probably learned to write as a pair of numbers in this format: (x, y). However, you could also add other terms for such things as the temperature (t), color (c), mass (m), and so on, as well as vertical elevation from the page (z), and get a vector with at least 6 parts: (x, y, t, c, m, z). How one would deal with this sextuple, one would have to decide; there are different rules for different situations. And clearly, the authors of this two-page exercise in unreadability aren’t saying how one deals with those vectors or variables.

As far as I can tell, the entire algorithm is still sealed safely away in a black box so that no one outside of Mathematica can possibly understand it or challenge it.

Published on March 2, 2011 at 4:43 pm. Comments (20)

1. Um… wow. This is completely meaningless without a LOT more information.

• You got that right. It’s almost like Euler saying to Diderot, “(a + b^n)/n = x, donc Dieu existe, répondez.”

• Euler, or, Ferris Bueller?

2. While I don’t have the math proficiency to comment, it seems that their inability/unwillingness to explain the formula in plain English suggests nothing other than creating a fog of misrepresentation.

It’s (a very narrow, management-dominated) human resource management couched in pseudo-science and gobbledygook.

3. I wonder how many millions went into this masterpiece?

4. Send this to Val Strauss and let the public, or “Frank,” decide if this makes any sense.

• Why don’t you try doing so? Valerie Strauss has never responded to any email or phone call from me.

• Isolation feeds her insight.

5. Reads to me like a recipe for how to make a robot. But then, I’m not “Mathimatica”ly inclined.

• If it is such a recipe, then it’s missing all of the parts.

6. This is just a standard linear regression. The error term is just the residual number where the equation doesn’t fit the dependent variable. What is vague about the equation? It specifically describes what variables were included in the school and student vectors.
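To make the point about the error term concrete, here is a minimal sketch of a one-variable linear regression in plain Python. The data and the names `prior`, `current`, and `fit_line` are invented for illustration; nothing here is taken from the DCPS or Mathematica model, whose coefficients are not given in the post.

```python
# Hypothetical data, invented for illustration only:
# prior-year test score (x) and current-year test score (y) for five students.
prior = [40.0, 50.0, 60.0, 70.0, 80.0]
current = [44.0, 52.0, 63.0, 69.0, 82.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(prior, current)

# The "error term" for each student is the residual:
# the observed score minus the score the fitted line predicts.
residuals = [y - (intercept + slope * x) for x, y in zip(prior, current)]

for x, y, e in zip(prior, current, residuals):
    print(f"prior={x:5.1f}  observed={y:5.1f}  residual={e:+.2f}")
```

There is one residual per student, a single number each, which is why the error term has the same shape as the dependent variable. With an intercept in the model, the residuals also sum to zero (up to rounding).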

• The problem with this linear regression is that we still have no details on any of the values of any of the variables and terms!

• I agree that they absolutely should reveal what the coefficients are. Maybe they do elsewhere in the document — you haven’t posted the whole thing. The values of the variables obviously vary by student and school. Free lunch eligibility, language proficiency, learning disability — I assume that those are relatively well-defined within DCPS. Mathematica got their data from somewhere. You don’t know those variables for each kid in your classroom?

Again, the dimensionality of the error term is the same as that of the dependent variable. When you are running a linear regression on a single dependent variable it makes no sense to refer to a vector of error terms.

• No, I posted the ENTIRE document. They reveal NOTHING.

• Many of the variables you speak of are “yes-no” type variables. How exactly do you use those to apply some sort of regression?

7. They are “dummy variables.” The coefficient on them is how much the dependent variable should change given a switch in the yes/no. For example, if the dependent variable is test score and the coefficient on the variable “eligible for free lunch” is -5, then one would expect a child eligible for free lunch to score 5 points lower than one who isn’t, holding everything else constant (and ignoring all the philosophical problems of pretending to have two children exactly alike except for free lunch eligibility…)
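A minimal sketch of the dummy-variable idea, in plain Python. Everything here is hypothetical: the data are constructed so that free-lunch eligibility lowers the score by exactly 5 points, and the helper names `solve` and `ols` are mine, not anything from the DCPS model.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, ys):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical students: prior score and a yes/no flag coded as 1/0.
prior = [40.0, 55.0, 60.0, 70.0, 45.0, 65.0, 80.0, 50.0]
free_lunch = [1, 0, 1, 0, 0, 1, 0, 1]

# Scores built by an assumed rule: 20 + 0.8 * prior - 5 * free_lunch, no noise.
score = [20 + 0.8 * p - 5 * d for p, d in zip(prior, free_lunch)]

# Each row of the design matrix: [constant, prior score, dummy].
rows = [[1.0, p, float(d)] for p, d in zip(prior, free_lunch)]
const, b_prior, b_lunch = ols(rows, score)

print(f"coefficient on free-lunch dummy: {b_lunch:.3f}")
```

Because the yes/no variable is just a column of 1s and 0s, the regression treats it like any other number, and the fitted coefficient on it recovers the 5-point gap built into the data, holding the prior score constant.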

In addition to revealing the coefficients of the regression, they should reveal its fit (R^2) and summary statistics about the population (mean, standard deviation of each of the variables). That would be de rigueur for publishing in any peer-reviewed journal.
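For readers who have not met these quantities, here is a small sketch of how R^2 and the summary statistics are computed, using only Python's standard library. The observed and predicted scores are invented numbers, not DCPS data.

```python
import statistics

# Hypothetical observed scores and the scores some fitted model predicted.
observed = [44.0, 52.0, 63.0, 69.0, 82.0]
predicted = [43.4, 52.7, 62.0, 71.3, 80.6]

mean_obs = statistics.mean(observed)
sd_obs = statistics.stdev(observed)  # sample standard deviation

# R^2 = 1 - (unexplained variation) / (total variation)
ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
ss_tot = sum((y - mean_obs) ** 2 for y in observed)
r_squared = 1 - ss_res / ss_tot

print(f"mean = {mean_obs:.1f}, sd = {sd_obs:.2f}, R^2 = {r_squared:.3f}")
```

An R^2 near 1 means the model's predictions track the observed scores closely; a value near 0 means the model explains little of the variation between students.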

• Thanks for the additional information. When and if they finally release a technical paper, it will have been about two full years since the plan was implemented and teachers have been deemed effective or otherwise, based on a secret black box that no one is allowed to see the contents of. Would teachers ever be permitted to grade our own students, for years, based on an algorithm whose details were held secret from the students and parents? I don’t think so.

8. NYTimes has a good take on this today:
Evaluating New York Teachers, Perhaps the Numbers Do Lie
http://www.nytimes.com/2011/03/07/education/07winerip.html?_r=1

The calculation for Ms. Isaacson’s 3.69 predicted score is even more daunting. It is based on 32 variables — including whether a student was “retained in grade before pretest year” and whether a student is “new to city in pretest or post-test year.”
Those 32 variables are plugged into a statistical model that looks like one of those equations that in “Good Will Hunting” only Matt Damon was capable of solving.
The process appears transparent, but it is clear as mud, even for smart lay people like teachers, principals and — I hesitate to say this — journalists.

Ms. Isaacson may have two Ivy League degrees, but she is lost. “I find this impossible to understand,” she said.

In plain English, Ms. Isaacson’s best guess about what the department is trying to tell her is: Even though 65 of her 66 students scored proficient on the state test, more of her 3s should have been 4s.

But that is only a guess.

Moreover, as the city indicates on the data reports, there is a large margin of error. So Ms. Isaacson’s 7th percentile could actually be as low as zero or as high as the 52nd percentile — a score that could have earned her tenure.

9. If we are ever allowed to see the complete formula, here is one of the things I want someone who can perform the task (certainly not me) to check. It has to do with chronic absenteeism and the fact that it has a large rotating component to it. The questions I assume to be relevant are: When chronic absenteeism results in a teacher never having the same kids in the same class period on the same day, at what point can that teacher no longer be held accountable for all kids making adequate progress? Is there a general threshold for percentage of absentees that invalidates the VAM results? I’m sure there are better ways to construct the question, but I think you all get the idea.

10. In the following link they state that “All variables are binary with the exceptions of absence rate, suspension rate, and age.” http://www.mathematica-mpr.com/publications/pdfs/education/value-added_pittsburgh.pdf
