SCRIPT: Speech Synthesis for Spoken Content Production
How can we support the research of hybrid text-to-speech synthesis technologies in low-resource languages?
SCRIPT is a 3-year research and innovation project looking to develop synthetic voices for low-resourced languages. The BBC is a SCRIPT project partner and is collaborating with the University of Edinburgh's Centre for Speech Technology Research (CSTR), where the project began in January 2017.
The idea for SCRIPT originated at a language technology #newsHACK that we hosted in collaboration with BBC Connected Studio. Our aim in partnering with the CSTR is to support research into hybrid text-to-speech synthesis technologies for low-resource languages.
The CSTR is researching the possible integration of two methods of producing synthetic voices: unit selection and deep neural network (parametric) text-to-speech. This would combine the broadcast-quality, 'natural'-sounding voice output of unit selection with deep neural network technology that enables parametric changes to tone, pitch and speed.
We have decided to focus our effort on the creation of voices in Hausa, Swahili and Bengali - all important languages for the BBC World Service. The Language Technology workstream is interested in the quality of synthesized voices that can be created given the lack of training data in low-resourced languages, and how these voices could be used in audience-facing products.