Grading the Computer's Sports Reporting

Steve Lohr wrote a piece in the NYTimes last Saturday about Narrative Science, a startup that's turning data into intelligible narratives.

If you haven't read about Narrative Science yet, I'll repeat what I just said lest you presume I speak metaphorically: the idea is to turn data, not into a chart, not into a graph, not into choices on an actionable dashboard, but into prose.

Lohr's article samples what he says is computer-generated reporting about a college football game.* Lohr's chosen sample reads pretty much like wire sports reporting, and the folks quoted in his article agree that the writing seems pretty good, even human.

I'm a tougher grader.

Individual sentences in the reporting are written well. They parse grammatically, and the diction is colloquial.

The paragraphs are pithy. The longest is four sentences. Considered by itself, it works:

"A one-yard touchdown run by Montee Ball capped off a two-play, 42-yard drive and extended Wisconsin’s lead to 51-3. The drive took 42 seconds. The key play on the drive was a 41-yard pass from Wilson to Bradie Ewing. A punt return gave the Badgers good starting field position at UNLV’s 42-yard line."

But you know the writing's not human because, paragraph to paragraph, opportunities for semantic transitions are missed.

For example, in the reporting preceding the paragraph quoted just above, we read that, "The Badgers started [a prior] drive at UNLV’s 28-yard line thanks to a Jared Abbrederis punt return." So the good punt return that led to the two-play drive that extended Wisconsin's lead to 51-3 - that was actually a second punt return worthy of note.

A human reporter wouldn't miss calling that out. "Another good punt return gave the Badgers good starting field position at UNLV's 42-yard line," she might write. Or, better yet, realizing she had already stated that the touchdown drive was 42 yards long, she would eliminate the redundancy at the end of the paragraph rather than insult our intelligence by telling us, in effect, that the 42-yard drive started on the 42-yard line. "The drive was yet again set up by a good punt return," she might write.
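For the programmers reading, here is a minimal sketch of the kind of bookkeeping that would catch the missed transition. It is emphatically not how Narrative Science does it (I have no idea how their tool works); the function name `describe_drives` and the dictionary fields are my own hypothetical inventions, and the only data used are the two punt returns quoted above. The idea is simply to remember which kinds of setup have already been mentioned for a team, and to open with "Another" when one repeats.

```python
def describe_drives(drives):
    """Render one sentence per drive, flagging a repeated kind of setup.

    Hypothetical illustration only: the field names below are assumptions,
    not the schema of any real sports-data feed.
    """
    seen = {}       # (team, setup) -> number of earlier mentions
    sentences = []
    for d in drives:
        key = (d["team"], d["setup"])
        prior = seen.get(key, 0)
        # A repeat of the same setup earns a transition word.
        opener = "Another" if prior else "A"
        sentences.append(
            f"{opener} {d['setup']} gave the {d['team']} the ball at "
            f"{d['opponent']}'s {d['start_yard']}-yard line."
        )
        seen[key] = prior + 1
    return sentences


# The two punt returns mentioned in the quoted report.
drives = [
    {"team": "Badgers", "opponent": "UNLV", "setup": "punt return", "start_yard": 28},
    {"team": "Badgers", "opponent": "UNLV", "setup": "punt return", "start_yard": 42},
]

for sentence in describe_drives(drives):
    print(sentence)
```

The second sentence comes out beginning "Another punt return," which is all the human reporter was asking for. Real narrative continuity is surely harder than a lookup table, but the sketch suggests the gap is one of memory across paragraphs, not of sentence craft.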

I'm a tougher grader, but I'm not a skeptic. The Narrative Science tool is obviously useful, and if it is already as good as it appears, it will only get better.

Computer-generated narrative could be the very tool to end advertising's stranglehold on new media, where every disruption to date seems eventually to surrender to it. That is, if the marketers don't get ahold of Narrative Science first!

*Caution: the Big Ten Network site on which the report appears credits the story to " staff."

Image: André Breton, "Automatic Writing."
