Language Technology

BBC News Labs started a stream of Language Technology projects in 2014 to scale our storytelling globally.


How might we efficiently scale BBC News' structured storytelling across a growing set of languages globally?

BBC News Labs is exploring ways of integrating Language Technology into the News production process. We experiment with innovative approaches to delivering News to our multilingual global audiences.

Within News Labs, Language Technology is a long-running workstream that encompasses a number of separate sub-projects.

What is Language Technology?

Language Technology covers all aspects of spoken and written human language. It is all around us: we use it frequently, in a growing number of places, often where we don't expect it and sometimes without realising it.

This workstream concentrates on three core components, which can be arranged in a pipeline:

  • Speech-to-Text (STT) - you'll find this in many telephone services around the world: any time a voice asks you to say "one", "yes" or "no" into your phone, the engine at the other end decodes what you've just said.
  • Machine Translation (MT) - this refers to translating from one human language into another. Many people will have tried Google's or Bing's translation tools at some point. These translations are based on fragments of human translations, which is why MT can sometimes sound really good. If not enough original human translations are available, the MT results can be a bit... well... garbled.
  • Text-to-Speech (TTS) - you know this from e-book readers, satnavs, tannoy announcements... and of course your smartphone's personal assistant, which uses both STT and TTS. A voice engine decodes your text and converts it into synthetic speech.
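
To make the pipeline idea concrete, here is a minimal sketch in Python of how the three components could be chained: audio in one language goes in, audio in another comes out. It is an illustration only, not a description of any BBC system. Every name in it (`LanguagePipeline`, `fake_stt`, `fake_mt`, `fake_tts`) is hypothetical, and the three stand-in stages do no real recognition, translation or synthesis.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; a real engine would sit behind each one.
SpeechToText = Callable[[bytes], str]        # audio -> transcript
Translate = Callable[[str, str, str], str]   # text, source lang, target lang -> text
TextToSpeech = Callable[[str, str], bytes]   # text, lang -> audio


@dataclass
class LanguagePipeline:
    """Chains STT, MT and TTS in that order."""
    stt: SpeechToText
    mt: Translate
    tts: TextToSpeech

    def run(self, audio: bytes, source_lang: str, target_lang: str) -> bytes:
        transcript = self.stt(audio)                                # speech -> text
        translated = self.mt(transcript, source_lang, target_lang)  # text -> text
        return self.tts(translated, target_lang)                    # text -> speech


# Stand-in stages (assumptions for illustration): they only pass text
# through so the flow of data between stages is visible.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_mt(text: str, source_lang: str, target_lang: str) -> str:
    return f"[{source_lang}->{target_lang}] {text}"


def fake_tts(text: str, lang: str) -> bytes:
    return text.encode("utf-8")


pipeline = LanguagePipeline(stt=fake_stt, mt=fake_mt, tts=fake_tts)
result = pipeline.run(b"hello", "en", "fr")
```

Because each stage is just a function with a fixed shape, any one of them could be swapped for a different engine without touching the other two, which is what makes it practical to try out what's already available.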

Beyond these core technologies, we are also interested in other areas of innovation such as Named Entity Recognition, Topic Detection, Automated Fact Checking, and Question Answering.

What's the purpose of this workstream?

This workstream primarily seeks to apply Language Technologies to the BBC's non-English-language news services.

  • We explore opportunities to make multilingual journalism more efficient (free up time for journalists to curate the News - not spend it on arduous tasks!)
  • We want to give our audiences a new experience of following the news (not everybody is comfortable in English... why should they miss out on news stories?)
  • Our method is to demonstrate the art of the possible through prototypes (save time on discussions and disconnected views - try it out!)
  • We track the state of the art, so that the BBC and its partners can take advantage once the technology is ready.

What's our approach?

  • Collaborate with universities and research groups (e.g. University of Edinburgh, UCL, Cambridge, Alan Turing Institute...)
  • Collaborate with international news broadcasters (e.g. Deutsche Welle) and industry bodies (e.g. the European Broadcasting Union)
  • Experiment with what's already available and find out what works and what doesn't work
