Bots in News Labs

In 2016 news fashion nerdery, bots are everywhere. Some news organisations are investing heavily in automated writing, while others are delivering their content through an automated message-like User Interface. Bloomberg is placing a bigger bet that automation can streamline its news gathering and production processes, creating a team dedicated to the task. Facebook, Slack and Telegram opened up powerful APIs on which to build all kinds of new delivery methods. Even The Economist spotted the trend.

economist screenshot

Leaving aside the comparison with Daleks, the Economist piece captions this photo “App exterminators.”

And they’re not too far from the truth, since bots offer news organisations the ability to piggy-back on a platform’s user base to push their content. What if you didn’t have to leave Facebook Messenger to read BBC News? What if we could push all the EU Referendum results to your Twitter timeline, as they happen?

News Labs have been building a few of those. We’ve built three bots, serving three popular bot platforms: Facebook, Twitter, and Telegram.


Facebook

First stop on our bots journey is Lei He’s Facebook bot, in partnership with BBC Mundo. Taking advantage of the newly released Messenger API, this bot communicates with users by automated messages.

Why Messenger?

We chose to use the Messenger platform after our colleagues in World Service identified it as a priority - a great new opportunity from one of the world’s biggest social media platforms. Spanish is one of the most widely spoken languages in the world, so BBC Mundo and its enthusiastic social media team were an obvious starting point.

Once we found out that Messenger would enable bots through their newly-released API, we started asking if we could use that functionality to reach a wide audience who might not come to the BBC’s sites or who might find it more convenient to get their news inside Messenger.

What does it do?

The bot lets users read and subscribe to BBC Mundo top stories within the Messenger app and get automated replies. Those who subscribe get two daily updates.

Apart from the automated messages, it also allows BBC Mundo editors to manually push breaking news to the bot subscribers. This project also provides an analytics dashboard, so we can track the metrics to assess the success of the bot.

Here’s a screenshot of the dashboard:

facebook bot dashboard

How does it work?

A user sends a message from their mobile or web Messenger app, and the message is routed via Facebook’s internal APIs to our bot service for a response. Our bot service sends the response to a special callback URL Facebook provided to us.

The diagram below shows how it works:

facebook bot diagram
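
A minimal Python sketch of that round trip, for illustration only: it pulls the sender and text out of an incoming webhook payload and builds the body of a plain-text reply. The payload shapes assume the Messenger Platform's publicly documented webhook and Send API formats; the function names and the reply text are hypothetical, and no network call is made.

```python
from typing import Iterator, Tuple


def extract_messages(payload: dict) -> Iterator[Tuple[str, str]]:
    """Yield (sender_id, text) for each text message in a webhook payload."""
    for entry in payload.get("entry", []):
        for event in entry.get("messaging", []):
            text = event.get("message", {}).get("text")
            if text:  # skip delivery receipts and other non-text events
                yield event["sender"]["id"], text


def build_reply(recipient_id: str, text: str) -> dict:
    """Build the JSON body for a plain-text reply to one user."""
    return {"recipient": {"id": recipient_id}, "message": {"text": text}}


# Hypothetical incoming payload containing a single text message.
incoming = {
    "object": "page",
    "entry": [
        {"messaging": [{"sender": {"id": "123"}, "message": {"text": "hola"}}]}
    ],
}

replies = [
    build_reply(sender_id, "Titulares de BBC Mundo")
    for sender_id, _text in extract_messages(incoming)
]
```

Keeping the parsing and the reply construction as pure functions makes them easy to test without a live Messenger connection.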

What’s next?

  1. We’d like to use machine learning techniques and train our bot using the Wit.ai bot engine, an API that turns natural language (speech or messages) into actionable data. With this in place, we hope to send more personalised messages and communicate more effectively with users.
  2. We’ll possibly expand the bot to other BBC World Service languages if the BBC Mundo Messenger bot proves to be a success.
  3. One of the things we hypothesise people will ask our bot for is news about a certain topic. We’d like to use the topic tagging system BBC News already has in place to pull up the most recent or popular/trending articles about that topic - for instance, if someone said ‘Tell me about David Cameron’, we could pull up a popular, recent article about him.

A challenge?

The challenge we’re facing is to give personality to the bot and to construct the messages with the right tone so that they feel natural and human.


Telegram

Next up is another collaboration with the BBC World Service, the result of a brainstorming session between Trushar Barot, Mobile Editor for Digital Development, and Jacqui Maher of News Labs. How do we reach audiences in places where the BBC is censored or blocked, whether by government or other means?

Why Telegram?

Trushar noticed that Telegram was the most popular messaging app in Uzbekistan, a country where bbcuzbek.com is blocked. When Telegram released its Bot API, Jacqui and Trushar realized there was an opportunity to distribute our coverage in an automated way, right on the platform the Uzbek audience was already using.

What does it do?

There are two main components to this project: one is the bots themselves, which integrate with Telegram through its API and handle messages to and from users of the platform. Creating a Telegram Bot that responds to commands beyond basic static text replies requires writing a fair amount of code. So, we built a journalist-friendly web-based admin that our editors can use to configure bots with an array of custom commands and features.

Each bot is associated with one of our language services, which in turn have a series of RSS feeds largely arranged around sections and topics. Our Bot Web Admin supports several types of commands:

  • Subscribe: send a daily digest of articles to users who sign up
  • Unsubscribe: allow users to manage which digests they receive
  • News on Demand: send the latest articles immediately
  • Simple text: send a message to users, changeable in real-time
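
To make the idea of command types concrete, here is a minimal Python sketch of how the four types above might be modelled and dispatched. It is an illustration under stated assumptions, not the Rails admin or the Go services themselves: the class names, field names, command strings and reply wording are all hypothetical, and subscriptions are held in memory rather than in a database.

```python
from dataclasses import dataclass, field


@dataclass
class Command:
    name: str       # what the user types, e.g. "/subscribe" (hypothetical)
    kind: str       # "subscribe", "unsubscribe", "news" or "text"
    feed: str = ""  # feed key for the feed-based command types
    text: str = ""  # reply body for the simple text type


@dataclass
class Bot:
    commands: dict = field(default_factory=dict)     # name -> Command
    subscribers: dict = field(default_factory=dict)  # feed -> set of chat ids
    latest: dict = field(default_factory=dict)       # feed -> list of headlines

    def handle(self, chat_id: int, message: str) -> str:
        """Look up the command named by the first word and act on its type."""
        words = message.strip().split()
        cmd = self.commands.get(words[0]) if words else None
        if cmd is None:
            return "Unknown command"
        if cmd.kind == "subscribe":
            self.subscribers.setdefault(cmd.feed, set()).add(chat_id)
            return f"Subscribed to {cmd.feed}"
        if cmd.kind == "unsubscribe":
            self.subscribers.get(cmd.feed, set()).discard(chat_id)
            return f"Unsubscribed from {cmd.feed}"
        if cmd.kind == "news":
            return "\n".join(self.latest.get(cmd.feed, [])) or "No stories yet"
        return cmd.text  # simple text: whatever the editor configured


bot = Bot(
    commands={
        c.name: c
        for c in [
            Command("/subscribe", "subscribe", feed="top"),
            Command("/unsubscribe", "unsubscribe", feed="top"),
            Command("/news", "news", feed="top"),
            Command("/about", "text", text="BBC Uzbek on Telegram"),
        ]
    },
    latest={"top": ["Story one", "Story two"]},
)
```

Because each command is just data, an editor can add or change one in the admin without anyone touching the dispatch code.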

How does it work?

The web admin is a Ruby on Rails application, populated with all of our language services and their associated feeds upfront. Editors and journalists initially create a bot by chatting with Telegram’s @BotFather, issuing the “/newbot” command and receiving an authentication token. This token is key to automating the service - once obtained, we can start configuring the bot in our web admin.

telegram bot admin

The bot services, written in Go for speed and efficiency, read from the same database where the web admin stores configuration settings. In addition to connecting to Telegram’s API and receiving messages from users, the bots also have to parse RSS feeds and format stories for display in the Telegram app. To support places that don’t use the Gregorian calendar, the bots can convert and localize article publication dates into other calendars, such as the Persian (Solar Hijri) calendar.
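
To illustrate the date conversion, here is a minimal Python sketch of one widely circulated integer-arithmetic conversion from Gregorian to Solar Hijri dates. It is not the Go code used in the bot services, and the function name is hypothetical; it is only meant to show that, for contemporary dates, the conversion comes down to counting days and dividing them into 33-year and 4-year leap cycles.

```python
from typing import Tuple

# Days elapsed before the start of each Gregorian month in a non-leap year.
DAYS_BEFORE_MONTH = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def gregorian_to_jalali(gy: int, gm: int, gd: int) -> Tuple[int, int, int]:
    """Convert a Gregorian date to a (year, month, day) Solar Hijri date."""
    # Leap days are counted up to the current year only once February has passed.
    leap_year_ref = gy + 1 if gm > 2 else gy
    days = (
        355666
        + 365 * gy
        + (leap_year_ref + 3) // 4
        - (leap_year_ref + 99) // 100
        + (leap_year_ref + 399) // 400
        + gd
        + DAYS_BEFORE_MONTH[gm - 1]
    )
    jy = -1595 + 33 * (days // 12053)  # 12053 days in a 33-year cycle
    days %= 12053
    jy += 4 * (days // 1461)           # 1461 days in a 4-year cycle
    days %= 1461
    if days > 365:
        jy += (days - 1) // 365
        days = (days - 1) % 365
    if days < 186:                     # first six months have 31 days each
        return jy, 1 + days // 31, 1 + days % 31
    return jy, 7 + (days - 186) // 30, 1 + (days - 186) % 30


referendum_day = gregorian_to_jalali(2016, 6, 23)  # (1395, 4, 3)
```

Localizing the output (month names, digits) would be a separate step on top of the numeric conversion.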

Of course, providing a service with subscription features requires a means of storing who has signed up to which feed, sending fully formatted articles to each of these recipient lists, and functionality allowing users to unsubscribe and manage their subscriptions. Keeping in mind the potential risks involved in reaching users who might be in a country which doesn’t like its citizens reading external news sources, we stored only the minimum amount of information about Telegram users, and encrypted what we did store.

We launched initially with @BBCUzbekBot: our Uzbek audience can now get the top stories on demand or subscribe for daily digests of stories by topic.

What were the main challenges?

  • Security
  • Ease of configuration - coming up with things like “command types” to encapsulate everything involved in a “subscribe” request, for example
  • Robust, fast, behind-the-scenes bot services that pick up configuration changes in real time
  • Scalability and reliability
  • Developing against a moving target (Telegram API changes along with internal feed changes)
  • Integration into the BBC Production pipeline, specifically the deployment platform, processes and tools.

This project also furthered News Labs’ know-how in moving from a rough prototype to a production-ready application, integrated into the BBC deployment tools and processes. The experience gained here will facilitate handover of other prototypes and pilots in the future.


Twitter

And of course, we’re finishing this review with Twitter. Developers have had access to a suite of tools for a while, and Jacqui Maher joined the EU referendum effort with a results delivery service based on those.

Background

On the night of the referendum results, animated graphics were generated for television display, one for each local area, triggered by the results data feed. The server generating these animations was also set up to take a still image of each animation and save it to a shared folder.

In addition to other related work in the run up to and day of the vote, the Visual Journalism team committed to sharing each image with a short summary to the @BBCReferendum Twitter account. This would be time-consuming to do manually, and seemed an obvious place to try out some automation… which is where News Labs got involved.

Responding to a direct editorial requirement, News Labs was able to write a bot that processed each image and automatically constructed tweets for each result. Being a news-focused labs team, we weren’t as constrained by other referendum deadlines, plus we saw some room for experimentation. This turned into a wonderful collaboration between Paul Sargeant of the Visual Journalism Team and Jacqui Maher of News Labs.

Once the results started coming in on the night of Thursday, 23rd of June, Jacqui Maher’s laptop said the outcome for each area aloud - in an appropriately accented voice, where possible - before posting the tweet.

How it worked

First, as the results would be reported from individual “counting areas”, a list we could compile ahead of time, we initialized the database (Redis) with a hash for each, keyed by the GSSID (a unique identifier of the location). Each hash contained variations of the area name, an image processing status (default: ‘waiting’), and a blank result value.
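
A minimal sketch of that initial state, using a plain Python dict as a stand-in for Redis so it runs anywhere. The key prefix and field names are hypothetical, and the two counting areas are illustrative entries only.

```python
from typing import Dict, Tuple

# Illustrative entries only: GSSID -> (short, medium, long) name variations.
COUNTING_AREAS: Dict[str, Tuple[str, str, str]] = {
    "E06000001": ("Hartlepool", "Hartlepool", "Hartlepool"),
    "S12000036": ("Edinburgh", "City of Edinburgh", "City of Edinburgh"),
}


def initial_record(short: str, medium: str, long_name: str) -> Dict[str, str]:
    """Fields stored for one counting area before any result arrives."""
    return {
        "name_short": short,
        "name_medium": medium,
        "name_long": long_name,
        "status": "waiting",  # image processing status
        "result": "",         # blank until the first result is reported
    }


def build_initial_store(
    areas: Dict[str, Tuple[str, str, str]]
) -> Dict[str, Dict[str, str]]:
    """One hash per counting area, keyed by its GSSID."""
    # With the redis-py client, each record could instead be written with
    # client.hset(key, mapping=record).
    return {
        f"area:{gss_id}": initial_record(*names)
        for gss_id, names in areas.items()
    }


store = build_initial_store(COUNTING_AREAS)
```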

Once we were set up on the night of the vote, we started up two scripts - one to process the images, the other to construct the tweet and post it.

Using a standard filewatcher library on the shared image folder, each new image was copied immediately to a “working” directory and then:

  • Cropped to reduce unnecessary blank space around the charts
  • Resized to the most appropriate size for tweets, inline and maximized
  • Overlaid with the BBC logo at the bottom center
  • Saved in a “tweetable” images directory upon success.

The database kept track of the process as well, moving from “waiting” to “processing” to “processed” once complete. Errors along the way were caught to prevent invalid images from moving through the queue.

Next it was time to compose some tweets and get the news out to social media. The second script ran in a similar manner, using a filewatcher on the “tweetable” images directory. We intentionally split up the work to allow modularity in the bot, with each piece specializing in a bit of the bigger picture.

Once images were resized, cropped and logo’d, we had to construct a message summarizing the result. Agreeing upon a file naming convention ahead of time and running through a series of dress rehearsals of possible scenarios dramatically facilitated this. Each image filename was parsed to fill in the right values according to a predefined set of validation checks and outcome rules.

The naming convention included the GSSID, Result (Leave, Remain, or Tie), area name, and image generation timestamp.
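
As an illustration of parsing such a filename, the Python sketch below assumes a hypothetical layout of `<GSSID>_<result letter>_<area name>_<YYYYMMDDHHMMSS>.png`. The four fields are the ones listed above, but the separator, their order, the single-letter result codes and the timestamp format are assumptions made for the example.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Hypothetical convention: E06000001_L_Hartlepool_20160624003000.png
FILENAME_PATTERN = re.compile(
    r"^(?P<gss>[EWSN]\d{8})_(?P<result>[LRT])_(?P<area>.+)_(?P<ts>\d{14})\.png$"
)


class ParsedImage(NamedTuple):
    gss_id: str
    result: str            # "L", "R" or "T"
    area_name: str
    generated_at: datetime


def parse_image_filename(filename: str) -> Optional[ParsedImage]:
    """Return the parsed fields, or None if the name breaks the convention."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        return None
    try:
        generated_at = datetime.strptime(match.group("ts"), "%Y%m%d%H%M%S")
    except ValueError:  # fourteen digits that do not form a real timestamp
        return None
    return ParsedImage(
        match.group("gss"), match.group("result"), match.group("area"), generated_at
    )


parsed = parse_image_filename("S12000036_R_City of Edinburgh_20160624021500.png")
```

Returning `None` for anything unexpected means a stray file in the folder is skipped rather than turned into a tweet.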

Challenges

Any code that generates sentences, especially for such a hugely important vote, could potentially go awry, so we did everything possible to avoid misinforming social media :)

Before attempting to construct a tweet, the counting area’s image status was verified to be “processed” in addition to it existing in the “tweetable” directory. But that’s not all! Covering our bases, we checked:

  • That the counting area existed in the database, looking up the right length name by the GSSID: short, medium or long variations should all have existed ahead of time. If the name couldn’t be found in the database, we simply fell back to the value parsed from the image filename.
  • That the area’s result value was blank - in other words, this was the first reported result from the location. If it wasn’t, and the incoming result had changed - e.g., from LEAVE to REMAIN - several alarms would go off, starting with a loud robotic voice alerting everyone in the room. The area’s status would get set to ‘flipped’, its previously reported result stored as a “past_result”, the incoming result replacing the earlier one, and most importantly, nothing would be tweeted. This scenario was rather unlikely and would require human intervention: someone who could sort it out reasonably and make a decision.
  • If the incoming result was the same as previously reported - e.g., it was REMAIN and still is REMAIN - we’d update the area’s status to “updated” as this scenario usually meant a more accurate count was issued from one of our sources. For accountability’s sake, we stored the previous result and incoming result just like the “flipped” scenario, to make it very clear that this was not the first report of a result for an area.
  • Finally, if this was indeed the first result reported, we’d set its status to “new”.
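
The result checks above can be sketched as a single function. This is a simplified Python illustration rather than the code that ran on the night: the area record is a plain dict, the function name is hypothetical, results are single letters ("L", "R", "T"), and the alarms are reduced to a comment.

```python
def reconcile(area: dict, incoming: str) -> bool:
    """Record an incoming result; return True only if it is safe to tweet."""
    previous = area.get("result", "")

    if previous == "":
        # First result reported for this counting area.
        area["status"] = "new"
        area["result"] = incoming
        return True

    # Not the first report: keep the earlier value for accountability.
    area["past_result"] = previous
    area["result"] = incoming

    if previous != incoming:
        # The outcome changed, e.g. from Leave to Remain. This is where the
        # alarms would sound; a human has to decide, so nothing is tweeted.
        area["status"] = "flipped"
        return False

    # Same outcome again, usually a more accurate count from a source.
    area["status"] = "updated"
    return True


area = {"status": "processed", "result": ""}
first = reconcile(area, "R")    # True, status becomes "new"
repeat = reconcile(area, "R")   # True, status becomes "updated"
flipped = reconcile(area, "L")  # False, status becomes "flipped"
```

The important property is that the only path that blocks a tweet is the one where the outcome has changed.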

Automatic Tweeting

Here’s how we constructed each tweet message:

  • If the area is in Northern Ireland, prepend the message with the text: “Local Result - “
  • If this area’s result is updated (but not changed), add “Updated result: “ to the message
  • What is the actual vote result?
    • “T”: “$area_name is tied.”
    • “L”: “$area_name votes to Leave.”
    • “R”: “$area_name votes to Remain.”
  • Add a link to the full results on the site: “ Full results: $results_url”
  • Finally, append the “#EURef” hashtag to the message.
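
Those rules translate almost line for line into code. A minimal Python sketch, assuming single-letter result codes as above; the function name, its parameters and the example URL are hypothetical.

```python
OUTCOME_TEMPLATES = {
    "T": "{area} is tied.",
    "L": "{area} votes to Leave.",
    "R": "{area} votes to Remain.",
}


def build_tweet(
    area_name: str,
    result: str,
    results_url: str,
    *,
    northern_ireland: bool = False,
    updated: bool = False,
) -> str:
    """Assemble the tweet text for one counting area's result."""
    message = ""
    if northern_ireland:
        message += "Local Result - "
    if updated:
        message += "Updated result: "
    message += OUTCOME_TEMPLATES[result].format(area=area_name)
    message += f" Full results: {results_url}"
    message += " #EURef"
    return message


# Illustrative inputs; the URL is a placeholder.
tweet = build_tweet("Hartlepool", "L", "https://example.org/results")
```

An unknown result code raises a `KeyError` here, which fits the general approach of refusing to tweet anything the rules do not explicitly cover.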

We’re now ready to tweet! Our bot would connect to Twitter’s REST API and then:

  • Announce, vocally, the area’s result
    • If the area is in Scotland, use the Scottish voice (“Fiona”)
    • If the area is in Northern Ireland, use the Irish voice (“Moira”)
    • If the area is in England, use the British English voice (“Daniel”)
    • If the area is in Wales, use an alternative British English voice (“Serena”) - there were no Welsh-accented English-speaking voices, unfortunately.
  • Tweet it!
  • Update the area’s status:
    • First result tweeted = “tweeted”
    • Updated result tweeted = “tweeted-update”

The Bot In Action!

euref bot screenshot

