How might we efficiently scale BBC News' structured storytelling across a growing set of languages globally?
BBC News Labs is exploring ways of integrating Language Technology into the News production process. We experiment with innovative approaches to delivering News to our multilingual global audiences.
What is Language Technology?
Language Tech is all around us: we use it more frequently, in more and more places (sometimes unexpected ones), and often without realising it. Language Technology (LT) covers all aspects of spoken and written human language, such as…
- Speech-to-Text (STT) - you’ll find this in many telephone services around the world: any time a voice asks you to say “one” or “yes” or “no” into your phone, a speech engine at the other end decodes what you’ve just said.
- Text-to-Speech (TTS) - you know this from e-book readers, satnavs, tannoy announcements… and of course your smartphone’s personal assistant, which uses both STT and TTS. A voice engine takes your written text and converts it into synthetic speech.
- Machine Translation (MT) - translating from one human language into another. Most people will have tried Google Translate or Bing Translator at some point. These systems learn from large collections of existing human translations, which is why the output sometimes sounds remarkably good. When not enough human translations are available for a language pair, the results can be a bit … well … garbled.
- Automatic Speech Recognition (ASR) - essentially another name for STT: the conversion of spoken human language into text.
- Speaker Diarisation - this process automatically identifies the different speakers in a stream of audio, for example when analysing an interview or a conversation. The audio stream is partitioned into segments according to speaker identity, which gives you a fairly good idea of who spoke when.
- Speech Synthesis - this is essentially TTS. There are two popular methods of creating synthetic voices: unit selection and statistical parametric synthesis. Unit selection takes recordings of a human voice and fragments them into tiny phonetic units: phonemes, consonant clusters, etc. When the speech engine decodes written text, it looks into its phonetic ‘library’, assembles the relevant fragments and turns them into quite natural-sounding, coherent audio. Statistical parametric synthesis works with a smaller library of phonetic fragments and sounds slightly less natural, but it lets you control stress and intonation - you can choose which words should be stressed in a sentence. A lot of current research focusses on how to combine the best of both methods to create more human-sounding voices. All this frequently raises the unanswerable question: what is “natural sounding” anyway?
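For the technically curious, the idea behind phrase-based machine translation - reusing fragments of existing human translations - can be sketched in a few lines. This is a toy illustration only: the phrase table below is invented and hand-made, whereas real MT systems learn millions of weighted fragments from parallel text.

```python
# Toy sketch of phrase-based MT: a hypothetical, hand-made "phrase table"
# of fragments taken from human translations (English -> German here).
PHRASE_TABLE = {
    "good morning": "guten Morgen",
    "the news": "die Nachrichten",
    "good": "gut",
    "morning": "Morgen",
}

def translate(sentence: str) -> str:
    """Greedily match the longest known phrase at each position."""
    words = sentence.lower().split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest span first, shrinking until a phrase matches.
        for j in range(len(words), i, -1):
            phrase = " ".join(words[i:j])
            if phrase in PHRASE_TABLE:
                out.append(PHRASE_TABLE[phrase])
                i = j
                break
        else:
            # Unknown word: passes through untranslated - this is where
            # real MT output starts to sound "garbled".
            out.append(words[i])
            i += 1
    return " ".join(out)

print(translate("Good morning the news today"))
# → guten Morgen die Nachrichten today
```

The untranslated "today" shows why coverage matters: the more human translations a system has seen, the fewer gaps like this remain.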
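The final step of speaker diarisation - turning per-frame speaker decisions into "who spoke when" - can also be sketched simply. This assumes the hard part (inferring a speaker label for each second of audio from acoustic features) has already been done; the labels below are invented.

```python
# Toy sketch of diarisation output: collapse per-second speaker labels
# (which a real system would infer from the audio) into segments.
def diarise(labels):
    """Turn a list of frame labels into (speaker, start, end) segments."""
    segments = []
    for t, speaker in enumerate(labels):
        if segments and segments[-1][0] == speaker:
            # Same speaker continues: extend the current segment.
            segments[-1] = (speaker, segments[-1][1], t + 1)
        else:
            # Speaker change: start a new segment.
            segments.append((speaker, t, t + 1))
    return segments

# One label per second of audio, e.g. an interview between A and B.
frames = ["A", "A", "A", "B", "B", "A", "A"]
print(diarise(frames))
# → [('A', 0, 3), ('B', 3, 5), ('A', 5, 7)]
```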
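The unit-selection idea - look up phonetic fragments in a library and concatenate them - can likewise be sketched. Everything here is a stand-in: the "audio fragments" are single placeholder bytes and the phoneme names are simplified, whereas a real engine stores thousands of recorded units and chooses between alternatives so the joins sound smooth.

```python
# Toy sketch of unit selection: a hypothetical library mapping phonetic
# units to (imaginary) audio fragments, concatenated to form speech.
UNIT_LIBRARY = {
    "HH": b"\x01", "AH": b"\x02", "L": b"\x03", "OW": b"\x04",
}

def synthesise(phonemes):
    """Look up each phonetic unit and join its audio fragment onto the output."""
    return b"".join(UNIT_LIBRARY[p] for p in phonemes)

# "hello" as a (simplified) phoneme sequence: HH AH L OW
audio = synthesise(["HH", "AH", "L", "OW"])
print(audio)  # → b'\x01\x02\x03\x04'
```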
What’s the purpose of this workstream?
- We explore opportunities to make multilingual journalism more efficient (free up time for journalists to curate the News - not spend it on arduous tasks!)
- We want to give our audiences a new experience of following the news (not everybody is comfortable in English… why should they miss out on news stories?)
- Our method is to demonstrate the art of the possible through prototypes (save time on discussions and disconnected views - try it out!)
- We track the state of the art, so that BBC and partners can take advantage when the tech reaches readiness.
What’s our approach?
- Collaborate with universities and research groups (University of Edinburgh, UCL, Cambridge, Alan Turing Institute…)
- Collaborate with international news broadcasters (e.g. Deutsche Welle)
- Experiment with what’s already available and find out what works / what doesn’t work