active

Frank - machine translation of BBC articles

Automatically translating BBC articles into English and aggregating them in one place.

Hypothesis

If editors can easily monitor stories published by the BBC’s language services, then journalists’ reports will reach a wider audience around the world.

The BBC publishes news articles in 43 languages ranging from Swahili to Serbian. Journalists from each language service publish a mix of their own original journalism and articles they have translated from other language services.

But there’s a problem

A language barrier and the sheer volume of stories both mean that editors aren’t always aware of relevant articles published by other language services.

Really good stories which would be of interest to a wider audience can end up being published in just one language.

News Labs is working on solving this problem.

Introducing Frank

Our software engineers built a prototype tool and named it Frank. The name is short for lingua franca: a common language used between people who do not share a native language.

Frank translates BBC articles into English then aggregates them on a web-based dashboard.

The BBC publishes news in more than 40 languages, including Swahili.

A journalist from a language service can select a story from that dashboard for Frank to translate into their own language. The journalist can then edit that translation within the browser.

How we built Frank

We learnt from a previous News Labs prototype, Live Page Translation, which aggregates and machine-translates the content from the short news updates on language services’ live pages.

This time, we used two machine translation services: Amazon Translate and the service from Global Under-Resourced Media Translation (GoURMET), a News Labs collaboration project.

The Frank prototype calls these services from a React app written in TypeScript.
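
One way to call two translation services from a single TypeScript app is to hide each one behind a shared interface and pick a service per article. The sketch below illustrates that idea under stated assumptions and is not Frank's actual code: the `Translator` interface, the `pickTranslator` and `translateToEnglish` helpers, and the language lists are all hypothetical, and the stub services return placeholder text instead of calling Amazon Translate or GoURMET.

```typescript
// Hypothetical shape for any machine translation backend.
interface Translator {
  name: string;
  // Language codes this backend is assumed to accept as source languages.
  sourceLanguages: string[];
  translate(text: string, sourceLang: string, targetLang: string): Promise<string>;
}

// Stub backend: returns tagged placeholder text rather than calling a real service.
function makeStubTranslator(name: string, sourceLanguages: string[]): Translator {
  return {
    name,
    sourceLanguages,
    translate: async (text, sourceLang, targetLang) =>
      `[${name} ${sourceLang}->${targetLang}] ${text}`,
  };
}

// Return the first backend that lists the source language, or undefined if none does.
function pickTranslator(translators: Translator[], sourceLang: string): Translator | undefined {
  return translators.find((t) => t.sourceLanguages.indexOf(sourceLang) !== -1);
}

// Translate an article body into English with whichever backend was picked.
async function translateToEnglish(
  translators: Translator[],
  text: string,
  sourceLang: string,
): Promise<string> {
  const translator = pickTranslator(translators, sourceLang);
  if (!translator) {
    throw new Error(`No translator configured for "${sourceLang}"`);
  }
  return translator.translate(text, sourceLang, "en");
}

// Illustrative configuration only; these language lists are made up.
const translators: Translator[] = [
  makeStubTranslator("stub-gourmet", ["sw"]),
  makeStubTranslator("stub-amazon", ["sr", "sw"]),
];
```

Because the order of the array decides which service wins when both list a language, preferring one service over the other would only mean reordering the configuration. The source does not say how Frank chooses between the two services, so that rule is an assumption of this sketch.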

We used another News Labs project, Depth Finder, to ingest BBC articles and filter them.

We added a filter to weed out stories digital editors told us they didn’t want. For example, editors said that they usually have a disproportionate number of stories about death and destruction but struggle to find original reporting, explainers and inspiring stories.

The filter selects stories journalists have tagged with the World Service user needs “inspire me”, “divert me”, “educate me” and “give me perspective”. It drops articles that journalists have tagged as “update me” or “keep me on trend” stories.
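
The user-need filter described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the Depth Finder implementation: the `Article` shape, the `filterByUserNeed` function and the constant names are hypothetical. The source does not say what happens to an article tagged with both a kept and a dropped user need, so the sketch assumes the dropped tag takes priority.

```typescript
// The six user-need labels named in the text.
type UserNeed =
  | "inspire me"
  | "divert me"
  | "educate me"
  | "give me perspective"
  | "update me"
  | "keep me on trend";

// Hypothetical article shape; the real BBC metadata will differ.
interface Article {
  id: string;
  headline: string;
  userNeeds: UserNeed[];
}

const KEPT_NEEDS: UserNeed[] = ["inspire me", "divert me", "educate me", "give me perspective"];
const DROPPED_NEEDS: UserNeed[] = ["update me", "keep me on trend"];

// True if the article carries at least one of the candidate user needs.
function hasAny(needs: UserNeed[], candidates: UserNeed[]): boolean {
  return needs.some((need) => candidates.indexOf(need) !== -1);
}

// Keep articles with at least one kept need and no dropped need (assumed rule).
function filterByUserNeed(articles: Article[]): Article[] {
  return articles.filter(
    (a) => hasAny(a.userNeeds, KEPT_NEEDS) && !hasAny(a.userNeeds, DROPPED_NEEDS),
  );
}
```

Under this rule an untagged article is also dropped, since it carries none of the kept user needs.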

Next steps

We are testing Frank with digital editors working in the language services.

We will also give the GoURMET team the translations journalists have edited. This data could be used to improve the accuracy of further translations.

We will assess how the tests go and, depending on the outcome, may add more filters. One filter could highlight stories that have proved particularly popular with the audience. Another could exclude stories that are themselves translations.

We will then consider whether to put this prototype into production for general use.

Results

  • The prototype is being tested with teams in the newsroom.
