Matching Video Content from the World Service
Can we automatically detect video content shared on BBC social media channels that originated from a shared source, even if it has been edited or cropped?
The BBC World Service produces video content in a wide variety of languages. Some of the content we share on our social media channels is developed centrally in a team called DigiHub and is intended to be translated and reshared by individual language services. Currently, DigiHub producers find it difficult to track where their content is being shared, how popular it is, and whether it has wide appeal or is being picked up by only certain services.
To solve this problem, we're building a prototype tracker tool. It monitors both the BBC’s internal video management tool where DigiHub videos are stored and the various BBC World Service YouTube channels. Our prototype attempts to detect where video content has been reshared even if it has been edited or language-specific subtitles have been added.
The tracker is completely automatic. It produces a daily report on new publications, giving the DigiHub team an overview of where their content is used and how it is performing, so that they can examine audience engagement.
How does it work?
We couldn't simply compare the videos frame by frame since:
- videos may have been edited
- localised subtitles or graphics may have been overlaid
- service-specific identifiers may have been appended to the start or end of the video.
By using a "digital fingerprint", the tracker can identify the source of a video even if it has been changed, for example by cropping the image, adding large captions, or re-editing. A useful by-product of the tool is that it can help producers find alternative versions of videos within the database.
The digital fingerprint is produced by a neural network based on the X3D architecture, which has been trained to take a video and extract an "embedding" that reflects what is happening in it. For example, two videos which both contain someone dancing will have embeddings that are close together, while videos which contain people engaged in different activities will have embeddings which differ more.
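To make the idea of "closer" embeddings concrete, here is a minimal sketch in Python. The vectors are invented four-dimensional examples rather than real X3D outputs, and cosine similarity is an assumed choice of measure; the metric the tracker actually uses is not stated here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
dancing_original = [0.9, 0.1, 0.0, 0.2]
dancing_reshared = [0.8, 0.2, 0.1, 0.2]  # same activity, edited version
cooking_clip = [0.1, 0.9, 0.3, 0.0]      # different activity

sim_same = cosine_similarity(dancing_original, dancing_reshared)
sim_diff = cosine_similarity(dancing_original, cooking_clip)

# The two dancing videos score much higher than the dancing/cooking pair.
print(f"same activity: {sim_same:.2f}, different activity: {sim_diff:.2f}")
```

In this toy example the two dancing clips score far higher than the dancing and cooking pair, which is the property the matching step relies on.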
Using the k-nearest neighbours algorithm, we could create an index of the videos stored in the BBC's internal video management system and match videos shared on social media against that index. We could then use more traditional perceptual hashing methods to reject candidate matches that show the same activity but are in fact different videos.
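The two-stage matching described above can be sketched as follows. This is an illustration under stated assumptions, not the tracker's implementation: the index contents, the video identifiers, the single 64-bit hash per video, the brute-force search, and the `max_hash_distance` threshold are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbours(index, query_embedding, k):
    """Brute-force k-NN: the k (video_id, distance) pairs closest to the query."""
    ranked = sorted(
        ((vid, euclidean(entry["embedding"], query_embedding))
         for vid, entry in index.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:k]


def hamming(hash_a, hash_b):
    """Number of differing bits between two integer hashes."""
    return bin(hash_a ^ hash_b).count("1")


def match_video(index, query_embedding, query_hash, k=2, max_hash_distance=10):
    """Return the closest indexed video whose hash also agrees, else None."""
    for vid, _ in nearest_neighbours(index, query_embedding, k):
        # Reject candidates that look similar in activity but differ visually.
        if hamming(index[vid]["phash"], query_hash) <= max_hash_distance:
            return vid
    return None


# Hypothetical index: identifiers, embeddings and hashes are invented.
index = {
    "digihub-001": {"embedding": [0.90, 0.10, 0.00], "phash": 0xF0F0F0F0F0F0F0F0},
    "digihub-002": {"embedding": [0.85, 0.15, 0.05], "phash": 0x0123456789ABCDEF},
    "digihub-003": {"embedding": [0.10, 0.90, 0.20], "phash": 0xFFFF0000FFFF0000},
}

# A reshared clip: its embedding is nearest to digihub-002, but its hash
# differs from digihub-001 by only one bit.
query_embedding = [0.86, 0.14, 0.04]
query_hash = 0xF0F0F0F0F0F0F0F1

matched = match_video(index, query_embedding, query_hash)
print(matched)
```

Here the nearest neighbour by embedding is a different video showing a similar activity, so the hash check rejects it and the second candidate is returned instead. A real system would more likely hash individual frames and use an approximate nearest-neighbour library rather than a brute-force scan.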
The tracker currently works only with video on YouTube feeds, but News Labs hopes to roll it out to other social media platforms.
We are also looking at extending the "fingerprint" technique to cover written news articles which are shared in multiple languages.