The Today Programme on BBC Radio 4 last week carried an item on Narrative Science, a spin-out from Northwestern University’s Medill School of Journalism that is building technology to enable computers to write news stories automatically. It’s part of a growing trend whereby computer software is being developed to tell stories on behalf of brands and media outlets.
This story starts with data journalism. In its crudest form, databases of information are dumped onto the internet and interrogated by communities using crowdsourcing techniques. Wikileaks is the highest-profile example of this genre. But it is only the beginning.
Computers are being used to interrogate massive data sets and identify trends that humans might never spot. I’d thoroughly recommend reading Paul Bradshaw’s Online Journalism Blog if you are interested in exploring this area.
Automated research and editing
My Northumberland Social hackathon experiment used automatic techniques to gather and filter content. Human intervention is required solely to edit and publish content, although with some further thought that too could almost certainly be automated.
Percolate, a New York-based start-up, helps brands identify relevant conversations that are taking place across the social web. From here it is a short step to understanding how to engage as a brand on these new platforms, rather than spamming press releases in Facebook groups, as The Economist Lean Back 2.0 blog spotlighted yesterday.
Computers as storytellers
Narrative Science goes a stage beyond these techniques. It applies computer algorithms to report the facts of a story without human intervention. It’s applying its technology to stories that rely on lots of data, such as finance and sport. Computers can’t yet convey emotion or pace in a story, but they can do a good job of reporting facts.
Summaries of school and college sports events and financial market reports are produced by stitching together results and building a narrative around the data. There are only so many ways that you can comment on the progress of a game of cricket or the movement of a stock.
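To make the idea of stitching results into a narrative concrete, here is a minimal sketch of template-based report writing. It is not Narrative Science's method, which isn't described here; the MatchResult type, the team names and the choice of verbs are all hypothetical and purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    """One row of results data. All field names are illustrative."""
    home: str
    away: str
    home_score: int
    away_score: int


def describe_margin(margin: int) -> str:
    """Pick a verb based on how wide the winning margin was."""
    if margin == 1:
        return "edged past"
    if margin <= 3:
        return "beat"
    return "thrashed"


def write_report(result: MatchResult) -> str:
    """Turn a single result into a one-sentence match report."""
    margin = abs(result.home_score - result.away_score)
    if margin == 0:
        return (f"{result.home} drew {result.home_score}-{result.away_score} "
                f"with {result.away}.")
    # Work out which side won so the sentence leads with the winner.
    if result.home_score > result.away_score:
        winner, loser = result.home, result.away
    else:
        winner, loser = result.away, result.home
    high = max(result.home_score, result.away_score)
    low = min(result.home_score, result.away_score)
    return f"{winner} {describe_margin(margin)} {loser} {high}-{low}."


if __name__ == "__main__":
    # Made-up fixture, not real data.
    print(write_report(MatchResult("Home Town", "Away United", 3, 2)))
```

A handful of templates and verbs goes a long way precisely because there are only so many ways to describe a result. A real system would presumably vary the phrasing far more and draw on richer data.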
In a delightful twist of irony, if you head to Narrative Science’s web site you’ll find a list of articles about the firm from the mainstream media. For now at least, there are no machines involved in reporting developments about the firm.
Automated content generation is already happening in other corners of the web. The BBC is using semantic web technologies as part of its sports reporting, and machines will form at least part of its coverage of the London 2012 Olympics. The BBC says that this enables journalists to focus on the craft of storytelling.
Web 3.0: the semantic web
Machined media is a topic that Philip Sheldrake has blogged about at length. He claims that as the web becomes a universal medium for the exchange of data, the discovery and creation of content can become autonomous. If Web 1.0 was a system of interlinked documents, and Web 2.0 is the social web, then Web 3.0, known as the semantic web, will add context to data on the web so that the web itself, or at least the machines connected to it, can understand what it all means.
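One common way to add that context is to store facts as subject, predicate, object statements, often called triples. The sketch below is a toy in-memory version in plain Python, not the BBC's system or any real semantic web library; every identifier in it (the ex: names, find, athletes_in) is a made-up placeholder.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical facts. The "ex:" names are placeholders, not a real vocabulary.
TRIPLES: List[Triple] = [
    ("ex:athlete1", "rdf:type", "ex:Athlete"),
    ("ex:athlete1", "ex:competesIn", "ex:event100m"),
    ("ex:athlete2", "rdf:type", "ex:Athlete"),
    ("ex:athlete2", "ex:competesIn", "ex:eventMarathon"),
    ("ex:event100m", "rdf:type", "ex:SprintEvent"),
]


def find(triples: List[Triple],
         subject: Optional[str] = None,
         predicate: Optional[str] = None,
         obj: Optional[str] = None) -> List[Triple]:
    """Return every triple matching the given parts; None matches anything."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


def athletes_in(event: str) -> List[str]:
    """List the subjects that are stated to compete in the given event."""
    return [s for s, _, _ in find(TRIPLES, predicate="ex:competesIn", obj=event)]


if __name__ == "__main__":
    print(athletes_in("ex:event100m"))
```

Because each statement says what kind of relationship it describes, a program can answer "who competes in this event?" without a human reading the page. That is the sense in which machines can begin to understand what the data means.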
This blog post was written by a human.