Improving neural machine translation by using human journalist edits in real-time model updates.


How can neural machine translation models immediately reflect human corrections to their output in subsequent translations?


MT-Stretch is a research collaboration project with the University of Edinburgh. It aims to create algorithms that can identify and understand human-made edits to machine-translated text, so that those edits can be fed back into the machine translation system. The system could then learn from the corrected mistranslations, increasing its accuracy.
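As a rough illustration of what identifying human-made edits could involve, the sketch below compares a machine translation with its journalist-edited version at the word level and lists the spans that changed. It is not the project's algorithm: the function name `extract_edits`, the whitespace tokenisation and the example sentences are all assumptions made for illustration, and it uses only Python's standard `difflib` module.

```python
from difflib import SequenceMatcher


def extract_edits(mt_output: str, post_edit: str) -> list[tuple[str, str, str]]:
    """Return (operation, original_span, edited_span) for each changed span.

    Hypothetical helper: tokens are split on whitespace, which is a
    simplification that real systems would likely replace with a
    language-aware tokeniser.
    """
    mt_tokens = mt_output.split()
    pe_tokens = post_edit.split()

    # autojunk=False stops difflib from ignoring very frequent tokens.
    matcher = SequenceMatcher(a=mt_tokens, b=pe_tokens, autojunk=False)

    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged text carries no correction
        edits.append(
            (tag, " ".join(mt_tokens[i1:i2]), " ".join(pe_tokens[j1:j2]))
        )
    return edits


# Invented example: the journalist corrects a mistranslated place name.
machine = "Officials in Dilli met the minister"
edited = "Officials in Delhi met the minister"
print(extract_edits(machine, edited))
```

Each extracted pair, such as a corrected place name, is the kind of signal that could in principle be fed back to a translation model. How that feedback is applied is a separate question that this sketch does not address.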

News can be difficult for a machine to translate because new names of people and places appear regularly, and a machine translation model can struggle to output them correctly. For news reporting, however, it is of the utmost importance that these details are accurately translated.

We are grounding this project by focussing on the BBC News Bureau in Delhi, which creates content in Hindi, Gujarati, Marathi, Punjabi, Tamil and Telugu. The machine translation models produced will translate directly between these languages, without going through an intermediate language, as well as between these Indian languages and English.

In time, the aim is to help news stories move more efficiently around the teams in the Delhi bureau, and to enable the Indian-language teams to re-purpose content authored in English more efficiently.

Research into machine translation is central to our Language Technology Workstream. This project is closely aligned with GoURMET.


