Trees, machines and local journalism
Wouldn’t it be great if a BBC news story - let’s say about tree planting - was tailored to your location? The challenge for us journalists is just like that of a seamstress fitting an outfit - individual stories have to be painstakingly crafted.
Our latest experiments with Semi-Automatic Local Content (Salco) show we are now able to cut the cloth, or rather the data, at scale. This summer we published dozens of versions of a story about tree planting, all in one day.
It all came about after we introduced the concept of natural language generation to the BBC’s England newsroom.
Dan Wainwright, a senior journalist in the Birmingham-based team’s data unit, was one of the first newsroom journalists at the BBC to use the new technology. We were impressed with what he came up with.
Building on our experience of using Salco to create stories about NHS waiting times and local election results, Dan worked on stories based on a dataset of tree planting statistics. His colleague, Rob England, obtained the data and worked on a longer story explaining the topic on the BBC News website.
Each local story explained how many trees had been planted with government funding in an individual local authority district since 2010. The long-form piece highlighted the scale of the challenge the UK faces to meet carbon reduction targets.
The creative process
After an introduction to the process from News Labs’ David Caswell, Dan set about building a model of the data that would feed into an article template crafted using software licensed from Aberdeen-based company Arria.
This workflow produced a story for each local authority district in England, with one summary story for all of London.
Each of the stories contained relevant local statistics that were also put in a national and regional context. The local versions of the story linked to the long-form article about the UK trend.
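For readers curious about the mechanics, the idea of a data model feeding a story template can be sketched in a few lines of Python. This is only an illustration and not the Arria software or the model Dan built: the names `DistrictStats` and `render_story` are hypothetical, the wording of the template is our own, and the districts, regions and figures are invented placeholders.

```python
from dataclasses import dataclass


@dataclass
class DistrictStats:
    """One row of the (hypothetical) data model: a single local authority district."""
    name: str
    region: str
    trees_planted: int  # trees planted with government funding since 2010


def regional_total(rows, region):
    """Sum the trees planted in every district belonging to one region."""
    return sum(r.trees_planted for r in rows if r.region == region)


def national_total(rows):
    """Sum the trees planted across every district in the dataset."""
    return sum(r.trees_planted for r in rows)


def render_story(district, rows):
    """Fill a simple template with local figures plus regional and national context."""
    if district.trees_planted == 0:
        # Branch on the data so a zero figure still reads naturally.
        opening = (f"No trees have been planted with government funding "
                   f"in {district.name} since 2010.")
    else:
        noun = "tree has" if district.trees_planted == 1 else "trees have"
        opening = (f"{district.trees_planted:,} {noun} been planted with "
                   f"government funding in {district.name} since 2010.")
    context = (f"That compares with {regional_total(rows, district.region):,} "
               f"across {district.region} and {national_total(rows):,} across England.")
    return f"{opening} {context}"


# Invented placeholder data, for illustration only.
rows = [
    DistrictStats("District A", "Region X", 1200),
    DistrictStats("District B", "Region X", 0),
    DistrictStats("District C", "Region Y", 1),
]

# One story per district, as in the workflow described above.
stories = {d.name: render_story(d, rows) for d in rows}
```

The branches inside `render_story` are the important part: a template has to cope with every shape the data can take, such as a district with no planting at all, before it can safely be run for every area at once.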
While many news stories do contain some local figures to provide context and colour, Dan explained how this can leave a little to be desired, like an outfit straight off the shop rack.
“We often provide localised figures in our stories but often these will end up focusing on a few places, whereas this project gave us the chance to use the local, more personal, aspect as a way into the story for the audience,” he said.
Publishing at scale
The process was not without its challenges. Stories had to be tagged with a relevant location identifier in order to appear on the BBC’s online local topic pages.
As part of the Salco process, Labbers built a tool to preview stories and the related tags before they were sent to the BBC’s online news content management system for publication.
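A pre-publication check of this kind might look something like the sketch below. It is a guess at the general shape rather than a description of the tool the Labbers built: `Story`, `split_for_preview` and the tag values are all hypothetical, and the real BBC location identifiers and content management system are not represented here.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    """A generated story waiting to be published (hypothetical structure)."""
    headline: str
    body: str
    location_tags: list = field(default_factory=list)


def split_for_preview(stories, known_tags):
    """Separate stories that are ready to send from those needing a human look.

    A story is held back if it has no location tag at all, or if any of its
    tags is missing from the set of identifiers we know about.
    """
    ready, held = [], []
    for story in stories:
        tags_ok = bool(story.location_tags) and all(
            tag in known_tags for tag in story.location_tags
        )
        (ready if tags_ok else held).append(story)
    return ready, held


# Placeholder identifiers, invented for this example.
known_tags = {"loc-district-a", "loc-district-b"}

batch = [
    Story("Trees in District A", "...", ["loc-district-a"]),
    Story("Trees in District B", "...", []),                  # untagged
    Story("Trees in District C", "...", ["loc-district-z"]),  # unknown tag
]

ready, held = split_for_preview(batch, known_tags)
```

Holding back anything with a missing or unrecognised tag, rather than publishing it anyway, means a story cannot quietly fail to appear on its local topic page.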
The logistics of publishing such a high volume of localised stories in one day were also a challenge, with the output of stories on the BBC publishing system more typical of a large live news event.
Cutting to the shape of local radio
In another first, the Salco process was also used to create over 40 unique radio briefs, one for each of the BBC’s local stations in England. These station-specific documents contained statistics tailored to each station’s reach, in order to help broadcast teams present the story with a local angle.
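Tailoring figures to a station's reach amounts to grouping district-level data by the area each station covers. The sketch below shows one way that could work, under our own assumptions: the function `build_briefs`, the station names, the coverage mapping and the numbers are all invented for illustration and do not reflect the actual briefs or station boundaries.

```python
def build_briefs(planted_by_district, station_coverage):
    """Build one plain-text brief per station from district-level figures.

    planted_by_district: dict mapping district name to trees planted.
    station_coverage: dict mapping station name to the districts it covers.
    """
    briefs = {}
    for station, districts in station_coverage.items():
        # Keep only the districts we actually have data for.
        local = {d: planted_by_district[d] for d in districts
                 if d in planted_by_district}
        lines = [
            f"Brief for {station}",
            f"Trees planted across the patch: {sum(local.values()):,}",
        ]
        # List districts from most to fewest trees planted.
        for name, count in sorted(local.items(), key=lambda kv: kv[1], reverse=True):
            lines.append(f"- {name}: {count:,}")
        briefs[station] = "\n".join(lines)
    return briefs


# Invented placeholder data, for illustration only.
planted = {"District A": 1200, "District B": 300, "District C": 50}
coverage = {
    "Station One": ["District A", "District B"],
    "Station Two": ["District C"],
}

briefs = build_briefs(planted, coverage)
```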
When you start to mix radio and online, long and short-form journalism all on the same story, that’s when things get really exciting.
We are now in a position where we could use the same technology for other topics where data can tell audiences how a story is locally relevant to them.
“Natural Language Generation systems like Salco can unlock rich and relevant stories contained in public datasets for everyone across the UK. These local stories, told in a language, style and tone that make them accessible to local audiences, can include more people in the data-driven debates that were previously open only to those who knew where to find the data,” David Caswell said.
What else could we do? Well, let’s just say our attention has been caught by the brewing UK general election.
We feel we are on to something, if not quite a journalist’s sewing machine.