Yet this is what many would like us to do. It is the bottom of a slippery slope that began when it was realized that different readers come up with different evaluations of the same written work–and do so even when both readers are teachers or experts. The first step taken by those this alarmed was the development of grading rubrics, guidelines meant to make sure everyone “read” a paper in the same way. This forced a new concentration on the squiggles, of course, and a move away from consideration of the dynamic. It also presupposed something of a Platonic form for the written “essay” (whatever that is), an ideal that all essays can aim for. All of this is nonsense, developed simply for ease of evaluation and assessment. That is, it was developed so that writing “success” could be boiled down to a number that could be judged against another.
Todd Farley, whose book Making the Grade: My Misadventures in the Standardized Testing Industry details just how fraught with errors (to say the least) American educational assessment is, wrote recently for The Huffington Post about the newest phase of the mania for machine grading of written work. He shows just how limited a view of writing is being proposed for assessment:
Provocative thoughts in those essays? The automated scoring programs failed to recognize them. Factual inaccuracies? The scoring engines didn’t realize they were there. Witty asides? Over the scoring engines’ heads they flew. Clichés on top of clichés? Unbothered by them the scoring systems were. A catchy turn-of-phrase? Not caught. A joke about a nitwit? Not laughed at. Irony or subtlety? Not seen. Emotion or repetition, depth or simplicity, sentiment or stupidity? Nope, the automated essay scoring engines missed ’em all. Humanity? Please.
And that’s just the start of it. The machine has no way of knowing if the essay is doing the job of communication that is its putative goal. It can only assess if the squiggles conform to a particular set of patterns set for the page.
In Verbal Behavior, B. F. Skinner attempted to develop a system for considering speech (and, by extension, writing) as a dynamic instead of as a ‘thing.’ Even in the 1950s, he was able to recognize that we were missing something when we evaluated language use simply through sets of formal rules. We’ve come a long way since then–or so we like to believe. Why, then, do we constantly regress to a view of language that, even fifty years ago, was recognized as insufficient?
We can only assess writing (and speech) acts through their impact and the resulting dialogues or actions. We cannot successfully assess them through examination of only a part of them. Through, in the case of writing, squiggles.