Live Page Translation

Aggregating and auto-translating live page content

Hypothesis

Can we use machine translation to help journalists work more effectively?

Putting together the perfect live page is a tough task.

Every day, our journalists monitor newswires and check coverage from around the BBC and beyond. They scan social media and keep in touch with contacts.

From all that information, they have to quickly select and synthesise the most relevant material for their audiences, keeping coverage accessible and interesting.

Live pages give regular news updates, often linking to longer form content.

It is a difficult, high-pressure role where speed and accuracy are key, and it is further complicated by the multilingual nature of global news.

A BBC Hausa journalist might be monitoring news in Arabic, Turkish, Somali and Swahili to provide the breadth expected by their audiences. A BBC Russian journalist is likely to be looking at content in Azeri, Turkish and beyond.

As you can imagine, good, relevant stories can be missed or lost because they first need to be translated.

By shadowing journalists, we learnt that a lot of their time was taken up by a manual machine translation process: navigating to each piece of content, then copying and pasting the text into free online translation tools.

Can automated machine translation help?

With the steady development of machine translation, we were hopeful that it would be possible to embed the technology deeper into our journalists’ ways of working.

First we tried injecting auto-translated content from other language services into the “drafts” column of our content management system. Our theory was that a journalist would see it, pick up the content and publish it.

Our live page content management system with a draft post auto-translated from Farsi

That didn’t work out. The translation quality wasn’t high enough to put straight in front of audiences, and we were cluttering the CMS with only semi-relevant content.

An internal newswire?

Journalists told us that some of the translated content was interesting. They also told us that the translation quality was good enough for them to get the gist of a story, even if it wasn’t accurate enough to put in front of audiences.

We took a step back and decided to build a free-standing prototype that ingested, auto-translated and aggregated all our live page content.

We tapped into the pipes that send content running around the BBC, and brought it together in one feed.
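To make the idea of "one feed" concrete, here is a minimal sketch of how several per-service streams of live page posts might be merged and auto-translated. It is an illustration under stated assumptions, not the prototype's actual code: the `Post` fields, the `build_feed` function and the `placeholder_translate` stand-in are all hypothetical names, and a real system would call a machine translation service where the placeholder is used.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone
from heapq import merge
from typing import Callable, Iterable, Iterator, Optional


@dataclass(frozen=True)
class Post:
    """One live page update. All field names here are hypothetical."""
    service: str                       # originating language service
    language: str                      # language code of the original text
    published: datetime                # timezone-aware publication time
    text: str                          # original-language text
    url: str                           # link back to the original post
    translation: Optional[str] = None  # filled in by build_feed


def placeholder_translate(text: str, source_language: str) -> str:
    """Stand-in for a real machine translation call (assumption)."""
    return f"[{source_language}->en] {text}"


def build_feed(
    streams: Iterable[Iterable[Post]],
    translate: Callable[[str, str], str],
) -> Iterator[Post]:
    """Merge streams into one newest-first feed and attach translations.

    Each input stream is assumed to be sorted newest-first already.
    """
    merged = merge(*streams, key=lambda post: post.published, reverse=True)
    for post in merged:
        # Keep the original text and URL so a journalist can jump back to it.
        yield replace(post, translation=translate(post.text, post.language))


if __name__ == "__main__":
    start = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
    persian = [
        Post("persian", "fa", start + timedelta(minutes=30), "post A", "url-a"),
        Post("persian", "fa", start, "post B", "url-b"),
    ]
    arabic = [
        Post("arabic", "ar", start + timedelta(minutes=10), "post C", "url-c"),
    ]
    for item in build_feed([persian, arabic], placeholder_translate):
        print(item.published.isoformat(), item.translation, item.url)
```

Passing the translation function in as a parameter keeps the merging logic independent of any particular translation provider, which is one reasonable way to structure it.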

Journalists could then browse a feed of content, and decide for themselves what to explore and put in front of audiences.

We worked with journalists as we went along, refining the features to allow them to jump to the original language post, to filter by language and to include or exclude sport.

An early version of the live page translation prototype

The Hausa journalist mentioned above? She can now filter for Arabic, Turkish, Somali and Swahili updates from the past six hours.

She can get the gist of the story at a glance, and then decide whether she wants to further explore and put it in front of her audience.
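The filters described above (by language, by time window, and with or without sport) can be sketched as a single function. This is a hedged illustration rather than the prototype's implementation: `FeedItem`, its fields and `filter_feed` are hypothetical names, and the two-letter language codes are an assumption about how languages might be labelled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Set


@dataclass(frozen=True)
class FeedItem:
    """One translated update in the feed. Field names are hypothetical."""
    language: str        # code of the original language, e.g. "ar"
    published: datetime  # timezone-aware publication time
    is_sport: bool       # whether the update is tagged as sport
    gist: str            # machine-translated text
    original_url: str    # link to the original-language post


def filter_feed(
    items: Iterable[FeedItem],
    languages: Set[str],
    now: datetime,
    window: timedelta = timedelta(hours=6),
    include_sport: bool = True,
) -> List[FeedItem]:
    """Keep items in the chosen languages published within the window."""
    cutoff = now - window
    return [
        item
        for item in items
        if item.language in languages
        and item.published >= cutoff
        and (include_sport or not item.is_sport)
    ]


if __name__ == "__main__":
    now = datetime(2020, 1, 1, 18, 0, tzinfo=timezone.utc)
    items = [
        FeedItem("ar", now - timedelta(hours=1), False, "recent news", "url-1"),
        FeedItem("tr", now - timedelta(hours=2), True, "recent sport", "url-2"),
        FeedItem("so", now - timedelta(hours=8), False, "older news", "url-3"),
        FeedItem("fa", now - timedelta(hours=1), False, "other language", "url-4"),
    ]
    # Arabic, Turkish, Somali and Swahili updates from the past six hours,
    # with sport excluded.
    selected = filter_feed(
        items, {"ar", "tr", "so", "sw"}, now, include_sport=False
    )
    for item in selected:
        print(item.gist, item.original_url)
```

Taking `now` as an argument, rather than reading the clock inside the function, makes the time window easy to test.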

Results

  • Initial analytics show a growing group of regular users who have told us that the prototype makes their lives simpler.
  • We are planning a set of pilots around the World Service to further quantify the benefits to journalists and, we hope, their audiences.
