Sunday 7 December 2014

Building a Twitter Bot with Python

South Africa has been facing some horrible load-shedding recently, with different areas experiencing complete blackouts for two-hour periods up to three times a day. Our electricity 'provider', Eskom, publishes schedules showing which areas will be without power and when, but these schedules depend on which stage of load-shedding they've decided to hit us with, with stages running from 0 (no load-shedding) to 3 (severe load-shedding). Unfortunately, the stages change at short notice and the schedules are completely different depending on which stage we're in. I wanted push notifications whenever the stage changed, so I decided to use Python and Twitter to implement a quick solution.

I thought it would take a few minutes, but the Twitter API has become more complicated since the last time I used it, and it ended up taking me a couple of hours from idea to final(?) implementation. The bot can be found here: https://twitter.com/eskomstagealert.

Because it only tweets when the stage changes, I enabled tweet notifications for it from my main Twitter account and now in theory I should get a notification for every change. I say in theory because although it passed all my tests with flying colours, there hasn't actually been a stage change since I turned him on.

The process I used to write a Twitter bot is as follows. I assume you're using Ubuntu and Python 2.7, but you should be able to adapt everything fairly trivially if you're not.

1. Create and set up a new Twitter Account
This may sound pretty self-explanatory, but Twitter requires some admin to be used without a human's touch. Twitter no longer allows automatic login through an API using just your username and password. Instead, we'll need to create a "Twitter app" and generate some IDs, tokens, and keys.

First, verify the account by clicking on the confirmation link that Twitter emails to you when you first sign up. Now, go to https://apps.twitter.com/ and click the "Create New App" button. Fill out the name and description, and enter any valid-looking URL for the website (we won't be using it, but it's a required field). 'Read' the Ts & Cs and click "Create your Twitter Application".

By default, the app only has read permissions. We'll be wanting to "write" (i.e., tweet), so we need to modify these. Unfortunately, Twitter demands your phone number to obtain Read & Write permissions, so you'll have to go to https://twitter.com/settings/add_phone and set that up.

Once you're done giving away your freedoms, go back to the Twitter apps page (https://apps.twitter.com/), and click on your app. Now click on the permissions tab, select the "read and write" radio button, and click "Update Settings".

Click on the "Keys and Access Tokens" tab and note where your 'Consumer Key (API Key)' and 'Consumer Secret (API Secret)' are. We'll come back to grab them in a bit. For now, just scroll down the page and click on the "Create my Access Token" button near the bottom. Above the button, you should now see some more info, including an 'Access Token' and an 'Access Token Secret'.

That's all the Twitter setup we need; let's get onto the Python.

2. Pythoning the Twitter Bot
There are a bunch of Python wrappers for the Twitter API. They all do pretty much the same thing, but I chose Twython, as it seemed well established, maintained and documented (and it had the cleverest name). To install it, simply run "pip install twython". If you don't have pip, shame on you. Go install it now. If you're ever going to write another line of Python code in your life, it'll be time very well spent.

The Python is very straightforward. Obviously the exception handling has room for improvement, but hey, this was never meant to be an industry-grade project. Finding the STATUS_URL, and then working out how the numbers related to stages, took a bit of trawling through Eskom's badly-written JavaScript and some use of BurpSuite. (Eskom's main loadshedding page loads the "No Loadshedding" message by default and then updates the value using JavaScript, which means that simply parsing the main page was a bit optimistic for my needs.) As we're only storing a single number at any given time, a database seemed overkill, so I use a text file instead. Just make sure that the user you're going to be running the script as has write permissions for FILE_PATH. (If you're not sure, use something like '/home/$USER/last_status.txt', with $USER replaced by your actual username, as Python won't expand it for you.) Create the last_status.txt file manually in the appropriate place with an initial value of 5, as there's no allowance for the file not existing or being blank in the code below (and the 5 means that it'll definitely be different from the current stage, so you'll get to see the first automatic tweet straight away).

import urllib2
from twython import Twython

STATUS_URL = "http://loadshedding.eskom.co.za/LoadShedding/getstatus"
NUMBER_MESSAGE_MAP = {"1": "No Loadshedding :)",
                      "2": "Stage 1 loadshedding active",
                      "3": "Stage 2 loadshedding active",
                      "4": "Stage 3 loadshedding active :("}

FILE_PATH = "/data/eskom/last_status.txt"

def send_tweet(tweet_text):
    # Credentials from the "Keys and Access Tokens" tab of your Twitter app
    APP_KEY = "your_app_key"        # Consumer Key (API Key)
    APP_SECRET = "your_app_secret"  # Consumer Secret (API Secret)
    OAUTH_TOKEN = "your_access_token"
    OAUTH_TOKEN_SECRET = "your_access_token_secret"
    twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
    twitter.update_status(status=tweet_text)

def get_html(url):
    response = urllib2.urlopen(url)
    page = response.read()
    return page

def get_previous_key():
    with open(FILE_PATH) as f:
        s = f.read()
    return s[0]

def update_stage_key(new_key):
    with open(FILE_PATH, "w") as f:
        f.write(new_key)

def main():
    try:
        previous_stage_key = get_previous_key()
        # strip() guards against stray whitespace around the returned number
        current_stage_key = get_html(STATUS_URL).strip()
        current_message = NUMBER_MESSAGE_MAP[current_stage_key]
        if current_stage_key != previous_stage_key:
            print "updating", current_message
            update_stage_key(current_stage_key)
            send_tweet(current_message)
    except Exception as e:
        # Printed output ends up in the cron log file
        print e

if __name__ == "__main__":
    main()
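
As mentioned, the script above makes no allowance for last_status.txt being missing or blank. If you'd rather not create the file by hand, here's a rough sketch of a more forgiving reader. The names read_previous_key and DEFAULT_KEY are my own inventions for illustration and aren't part of the script above; the fallback value of 5 is the same "can't possibly match" value suggested earlier.

```python
import os

# Hypothetical fallback: the same "impossible" starting value suggested in the text.
DEFAULT_KEY = "5"


def read_previous_key(path, default=DEFAULT_KEY):
    """Return the stored stage key, or `default` if the file is missing or blank."""
    if not os.path.exists(path):
        return default
    with open(path) as f:
        contents = f.read().strip()
    # Like the original, only the first character is treated as the key.
    return contents[0] if contents else default
```

You could swap this in for get_previous_key() by calling read_previous_key(FILE_PATH). The first run would then tweet the current stage and create the file via update_stage_key().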


3. Set up a Cron job
The script simply checks the current stage against the one it saw most recently. If they don't match, it saves the new one as the most recent one, and tweets a message about it. Therefore we need to run the script regularly in order to receive accurate updates. EskomStageAlert is quite diligent and eager to please, so he checks every minute for a change. To set your bot up for the same interval, do the following.
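
If you want to convince yourself that the compare-save-tweet logic works without hitting Eskom or Twitter, one option is to pull it out into a function that takes the fetching, saving and tweeting steps as arguments. This is only a sketch under my own assumptions: check_for_change and its parameters are made-up names, and quietly ignoring an unrecognised number (rather than raising a KeyError as the script above would) is a design choice of this sketch, not something the original does.

```python
NUMBER_MESSAGE_MAP = {
    "1": "No Loadshedding :)",
    "2": "Stage 1 loadshedding active",
    "3": "Stage 2 loadshedding active",
    "4": "Stage 3 loadshedding active :(",
}


def check_for_change(previous_key, fetch_current, save_key, tweet):
    """Run one check; return the tweeted message, or None if nothing was sent."""
    raw = fetch_current()
    if isinstance(raw, bytes):
        # Network reads may hand back bytes; the keys are plain ASCII digits.
        raw = raw.decode("ascii", "replace")
    current_key = raw.strip()
    if current_key == previous_key:
        return None  # stage unchanged, nothing to do
    message = NUMBER_MESSAGE_MAP.get(current_key)
    if message is None:
        return None  # unknown number: don't save it and don't tweet
    save_key(current_key)  # same order as the script: save first, then tweet
    tweet(message)
    return message
```

In the real script you'd call it with something like check_for_change(get_previous_key(), lambda: get_html(STATUS_URL), update_stage_key, send_tweet). For testing, you can pass in simple stand-ins such as a lambda returning "3" and a list's append method.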

Run the command "crontab -e" to edit your cron file. Press "2" to edit the file in Nano* if prompted (or, if you know what you're doing, select one of the other options), and append the following to the bottom of the file:

* * * * * python /home/$USER/eskom.py >> /home/$USER/eskom_log.txt

The asterisks at the beginning indicate that the task should be run every minute. Then comes our main command, which is python followed by the full path to our script (modify yours as necessary if you saved your script somewhere else; cron runs with a minimal environment, so if $USER isn't expanded, write the path out in full). The >> is optional, but it'll direct the output of our script to a file so we can check if anything goes wrong. Note that a single > will overwrite the log file every time, while using two (>>) will append to the file.
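
Because the log file is appended to every minute, it can be hard to tell when a given line was written. One possible tweak, sketched below, is to prefix whatever the script prints with a timestamp. The helper name format_log_line is hypothetical and not part of the script above.

```python
from datetime import datetime


def format_log_line(message, now=None):
    """Prefix a message with a timestamp so appended log entries can be told apart."""
    now = now or datetime.now()
    return "%s %s" % (now.strftime("%Y-%m-%d %H:%M:%S"), message)
```

In the script you would then print format_log_line("updating " + current_message) instead of printing the message on its own.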

And that's that. If all went well, your Twitter bot should be in action. Note that the API doesn't allow you to tweet a status identical to your previous one, so if you get a 403 Error, duplicate tweets could be the cause. If you get other errors, double check that you copied all of the API keys correctly (with no extra spaces or bits missing), and if that doesn't fix the issue then try regenerating them through the Twitter apps page.
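
To make the log a little more helpful when a tweet fails, you could turn the exception into a hint before printing it. The sketch below assumes that the exception Twython raises carries the HTTP status code in an error_code attribute (as far as I know, TwythonError does, but check the version you have installed). The function name explain_tweet_error is my own, and treating a 401 as a credentials problem is likewise an assumption.

```python
def explain_tweet_error(exc):
    """Return a short hint for a failed tweet, based on the status code if present."""
    # Assumption: Twython's exceptions expose the HTTP status as `error_code`.
    code = getattr(exc, "error_code", None)
    if code == 403:
        return "403: possibly a duplicate of the previous tweet"
    if code == 401:
        return "401: check the API keys and access tokens"
    return "unexpected error: %s" % exc
```

Inside the except block of main(), printing explain_tweet_error(e) instead of e would then leave a more readable trail in the log.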

To finish off, simply visit your Twitter bot from your main Twitter account, using the Twitter mobile app. Press the "Follow" button and then the grey star next to it (on the Android app, in any case; this might differ for other mobile apps). Now you'll get Twitter push notifications whenever your bot tweets!


* One of the few gripes I have with Ubuntu is their choice to use Nano as a recommended default for cron instead of vim, which is better, but let's not get into text editor flame wars here.