How computers “grade” essays

Computers cannot understand an essay. They can only apply a mathematical model or algorithm that its authors hope will work. As one of the companies marketing these programs concedes in its FAQ:

“It is important to note that although PEG software is extremely reliable in terms of producing scores that are comparable to those awarded by human judges, it can be fooled. Computers, like humans, are not perfect.


“PEG [the software] presumes “good faith” essays authored by “motivated” writers. A “good faith” essay is one that reflects the writer’s best efforts to respond to the assignment and the prompt without trickery or deceit. A “motivated” writer is one who genuinely wants to do well and for whom the assignment has some consequence (a grade, a factor in admissions or hiring, etc.).


“Efforts to “spoof” the system by typing in gibberish, repetitive phrases, or off-topic, illogical prose will produce illogical and essentially meaningless results.

“How does PEG evaluate content?

“Like most automated scoring technologies, PEG, when properly trained, can determine whether a student’s essay is on topic. PEG can identify the presence or absence of key words that give clues to the content. For example, references to the Nina, Pinta, and Santa Maria would lead PEG to the conclusion that the topic was related to the voyage of Christopher Columbus–provided that these keywords were defined prior to the analysis (or were frequently referenced in the training set).  


“However, analyzing the content for “correctness” is a much more complex challenge illustrated by the “Columbus Problem.” Consider the sentence, “Columbus navigated his tiny ships to the shores of Santa Maria.” The sentence, of course, is well framed, grammatically sound, and entirely on topic. It is also incorrect. Without a substantial knowledge base specifically aligned to the question, artificial intelligence (AI) technology will fail to grasp the “meaning” behind the prose. Likewise, evaluating “how well” a student has analyzed a problem or synthesized information from an article or other stimulus is currently beyond the capabilities of today’s state of the art automated scoring technologies.”
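
To make the “Columbus Problem” concrete, here is a minimal sketch of keyword-based topic detection in Python. It is not PEG's actual algorithm, which the FAQ does not disclose. The keyword list, the function names, and the two-keyword threshold are all assumptions made up for illustration. The point is only that a check of this kind sees which words are present, not whether the sentence is true.

```python
import re

# Hypothetical keyword list; the FAQ says such keywords would be defined
# in advance or learned from a training set.
COLUMBUS_KEYWORDS = {"columbus", "nina", "pinta", "santa maria"}


def keyword_hits(essay, keywords=COLUMBUS_KEYWORDS):
    """Count how many of the keywords appear in the essay as whole words."""
    # Lowercase, replace punctuation with spaces, and collapse whitespace.
    text = re.sub(r"[^a-z\s]", " ", essay.lower())
    text = " ".join(text.split())
    return sum(
        1 for kw in keywords
        if re.search(r"\b" + re.escape(kw) + r"\b", text)
    )


def is_on_topic(essay, min_hits=2):
    """Assumed rule: call the essay on topic if enough keywords appear."""
    return keyword_hits(essay) >= min_hits


correct = "Columbus sailed with the Nina, the Pinta, and the Santa Maria."
incorrect = "Columbus navigated his tiny ships to the shores of Santa Maria."

print(is_on_topic(correct))    # True
print(is_on_topic(incorrect))  # True, although the sentence is factually wrong
```

Both sentences pass the check. Nothing in the keyword match knows that the Santa Maria was a ship rather than a shoreline, which is exactly the gap the FAQ describes.
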

In sum, computers are not a real replacement for human judgement. If you want to teach students to write well and to solve problems involving math, you need small class sizes. Teachers need enough time to read and reread an entire essay, or to decipher how a student solved a problem; to give sound, professional judgement on whether the student demonstrated partial or full understanding; and then to decide what path to take toward more complete mastery of the topic at hand.

Bill Gates most definitely doesn’t understand this, even though he went to schools where teachers had that sort of mandate.

Thanks to Diane Ravitch for bringing this to my attention.

Published on June 28, 2015 at 3:00 pm
