Saturday, 26 July 2014

Pretty Python Progress


Often when I write loops in Python I want to know how much progress has been made. A simple way is to add a counter and a print statement:
import time
progress = 0
thingsToProcess = range(543)
for thingToProcess in thingsToProcess:
    progress += 1
    print "%s/%s" % (progress, len(thingsToProcess))
    time.sleep(0.05)
Or in fewer lines:
import time
thingsToProcess = range(543)
for progress, thingToProcess in enumerate(thingsToProcess, 1):  # start at 1 so the count ends on 543/543
    print "%s/%s" % (progress, len(thingsToProcess))
    time.sleep(0.05)
Which is OK, but it spams the output screen with a line for each iteration. We can improve a little bit by only printing on every Nth iteration:
import time
thingsToProcess = range(543)
for progress, thingToProcess in enumerate(thingsToProcess):
    if progress % 10 == 0:
        print "%s/%s" % (progress, len(thingsToProcess))
    time.sleep(0.05)
But we can do even better with not too much effort. Let's build progress bars and percentage counters! We use ANSI codes to print nice(?) colours in the terminal, and the "\r" special character (carriage return) to print over the same line in the terminal. We also need to "flush" the standard output on each print, or it will get buffered automatically and we'll only see the final line. Note also the comma at the end of the print statement in print_progress - this suppresses the newline character. To use, just call the print_progress function from inside any loop where:
  • You know the index of the current iteration
  • You know how many iterations there will be in total
  • There are no other print statements in the loop
Just change the string argument "colour" ("cyan") to any of the colours in the dictionary defined in print_progress to see the progress in other nice(?) colours. By subtracting 60 from each number in the colour dictionary a different shade of that colour will be printed instead (e.g. try using "32" instead of "92" for Green).
import sys
import time

def print_progress(current, total, colour=""):
    current += 1 # be optimistic so we finish on 100 
    colours = {"":0, "black":90, "red":91, "green":92, "yellow":93, "blue":94, "purple":95, "cyan":96, "white":97}
    COLOUR_START = '\033[%sm' % colours.get(colour, 0)  # unknown colour names fall back to the default colour
    COLOUR_END = '\033[0m'
    percent_float = float(current)/float(total) * 100
    percent = "%.1f" % percent_float
    bar = "|%s>%s|" % ("-" * int(percent_float/4), " " * (25 - int(percent_float/4))) 
    print "\r%s%s / %s - %s%% %s %s" % (COLOUR_START, current, total, percent, bar, COLOUR_END),
    sys.stdout.flush()

thingsToProcess = range(145)
for progress,value in enumerate(thingsToProcess):
    print_progress(progress, len(thingsToProcess), "cyan")
    time.sleep(0.05)
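The listing above is Python 2. On Python 3, where print is a function, the same idea could be sketched roughly as follows. This is only a sketch under a few assumptions: format_progress and its bright parameter are names made up for this example, the string building is split out so it can be checked without a terminal, and print's end="" and flush=True arguments stand in for the trailing comma and the sys.stdout.flush() call.

```python
import time

# Bright ANSI foreground codes, same values as the dictionary above.
COLOURS = {"": 0, "black": 90, "red": 91, "green": 92, "yellow": 93,
           "blue": 94, "purple": 95, "cyan": 96, "white": 97}


def format_progress(current, total, colour="", bright=True):
    """Build one progress line (hypothetical helper, Python 3)."""
    current += 1  # be optimistic so we finish on 100
    code = COLOURS.get(colour, 0)  # unknown names fall back to 0 (reset)
    if code and not bright:
        code -= 60  # e.g. 92 (bright green) becomes 32 (green)
    percent = current / total * 100
    filled = int(percent / 4)  # the bar is 25 characters wide
    bar = "|%s>%s|" % ("-" * filled, " " * (25 - filled))
    return "\r\033[%sm%s / %s - %.1f%% %s \033[0m" % (
        code, current, total, percent, bar)


def print_progress(current, total, colour="", bright=True):
    # end="" suppresses the newline; flush=True pushes the line out immediately
    print(format_progress(current, total, colour, bright), end="", flush=True)


if __name__ == "__main__":
    things = range(20)
    for progress, value in enumerate(things):
        print_progress(progress, len(things), "cyan")
        time.sleep(0.01)
    print()  # move off the progress line once the loop is done
```

Passing bright=False gives the darker shade described earlier (the code minus 60).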

Edit: I've updated the code. The new version adds time remaining and uses a class. Less hacky, more efficient, better. See the run_demo function for example usage. Full listing below.

# Gareth Dwyer, 2014
# A simple progress bar for Python for loops, featuring
#   * Percentage counter
#   * ASCII bar
#   * Time remaining
#   * Customizable additional info (display last data processed)
#   * Customizable colours
#   

import sys
import time
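from datetime import timedelta

def convert_seconds(num_seconds):
    """ convert seconds to days, hours, minutes, and seconds """
    # Read the fields straight off the timedelta; adding it to datetime(1,1,1)
    # and using d.day-1 gives the wrong day count once a task runs past 31 days.
    td = timedelta(seconds=num_seconds)
    hours, remainder = divmod(td.seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "%dd %dh %dm %ds" % (td.days, hours, minutes, seconds)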

def run_demo():
    """ create a ProgressBar and run """
    pb = ProgressBar("cyan")
    data = range(30,56)
    for i,v in enumerate(data):
        time.sleep(0.5)
        pb.print_progress(i, len(data), v)

class ProgressBar:

    def __init__(self, colour="green"):
        """ Create a progress bar and initialise start time of task """
        self.start_time = time.time()
        self.colours = {"":0, "black":90, "red":91, "green":92, "yellow":93, "blue":94, "purple":95, "cyan":96, "white":97}
        # unknown colour names fall back to 0, the default terminal colour
        self.start_colour = "\033[%sm" % self.colours.get(colour, 0)
        self.end_colour = "\033[0m"

    def print_progress(self, current, total, additional_info=""):
        """ Call inside for loop, passing current index and total length of iterable """
        if additional_info:
            additional_info = "[%s]" % additional_info
        current += 1 # be optimistic so we finish on 100 
        percent = float(current)/float(total) * 100
        remaining_time = convert_seconds((100 - percent) * (time.time() - self.start_time)/percent)
        percent_string = "%.1f" % percent
        bar = "|%s>%s|" % ("-" * int(percent/4), " " * (25 - int(percent/4))) 
        print "\r%s%s / %s - %s%% %s %s remaining: %s %s" % (self.start_colour, current, total, percent_string, bar, additional_info, remaining_time, self.end_colour),
        sys.stdout.flush()

if __name__ == '__main__':
    run_demo()
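
The time-remaining figure in print_progress is a simple linear extrapolation: it assumes the items still to come will take as long, on average, as the ones already done. Here is that calculation on its own as a small Python 3 sketch. The function name estimate_remaining is made up for this example and isn't part of the listing above.

```python
def estimate_remaining(elapsed, current, total):
    """Seconds left, assuming each remaining item takes the average time so far.

    Equivalent to (100 - percent) * elapsed / percent in the class above.
    """
    done = current + 1  # same "optimistic" off-by-one as print_progress
    return elapsed * (total - done) / done


if __name__ == "__main__":
    # 5 of 20 items finished after 10 seconds: 2s each, 15 items to go
    print(estimate_remaining(10.0, 4, 20))  # prints 30.0
```

The estimate jumps around early on, when only a few items have been timed, and settles down as the loop progresses.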

Saturday, 19 July 2014

How (not) to impress potential employees

Maybe you're involved in a company that is interested in employing computer science students. Maybe one day you will be. I've just come back from a week-long 'field trip' which had little to do with fields, but involved various companies in Cape Town doing their utmost to impress me and my classmates and to persuade us to apply to work for them. Some rose spectacularly to the occasion, while others ... didn't. I couldn't help but notice some very clearly defined differences between the two extremes, so here's a (relatively) brief how-to on impressing students.

The companies we visited were BSG, Amazon, Korbitec, Centre for High Performance Computing, Bandwidth Barn, Open Box, and KPMG, and each had us in their offices for about half a day. While I won't go in for too much "naming and shaming", there are some definite yeses and nos about how to host a group of students / potential employees.

No slides
You can't possibly explain what you're all about without PowerPoint, right? Wrong. At two of the companies, we didn't see a single presentation, and it's no coincidence that these are the two companies where the class wasn't bored, playing on laptops, or sending witty messages (largely highly insulting to the company) on the class WhatsApp group. While anyone can give a PowerPoint presentation, it takes someone with some public speaking skills to talk to people informatively without hiding behind (in front of) slides. But even if you only have one person who can do this effectively, make sure they are available when the students arrive.

Definitely no software/tutorial slides
We're comp sci students. We've seen a lot of slides about how to use software and how computers or computing concepts work. This is our vacation, and we're here to hear about your company - not about Scrum methodology or frameworks that were the latest and greatest 5 years ago and which your company still thinks are worth talking about. One company had someone present a slideshow about AngularJS - a slideshow that the presenter happily admitted to having scrounged from the top Google result because he didn't have time to throw anything together himself. He used phrases such as "I'm not quite sure what this variable does, it was declared further up, ummm, I think". 

BSG tried to give us a miniature "workshop", which involved over four hours of slides and incompetent speakers. We were highly amused to discover an article on their website afterwards, claiming that the event had been a complete success. They went so far as to say that: "When asked whether they would like to work at BSG when they graduated, the students at the event were unanimous". And we were even more amused to see that they'd put a photo of the Information Systems class instead of a photo of us.

Don't appear stingy
We're all living off student budgets at the moment. When we open a menu at a restaurant, we automatically scan the price column for potential meals, and then look to the description for confirmation. Money still has a sense of intrigue. One company provided a couple of bottles of champagne, enough pizza to carpet my digs, a wide variety of drinks which we failed to finish, and yo-yos. We were impressed and the yo-yos were played with during boring visits to other companies. Another company asked us to give back the name-tags they'd given us, as they wanted to reuse the safety pins. We started joking in horrible ways about them before we were out of earshot.

Allow us to interact with your employees
Anyone can make a company seem glamorous for a couple of hours. If we only see one room and two people, we are immediately suspicious about working conditions for the rest of the staff. Once we've heard the (short) introduction, give us food and invite your other staff too. We want to be able to engage in one-on-one casual conversations with people who are working there to get a less biased impression of your company. 

Tell us how much money we stand to make
Money is important to us. Every company without exception offered "a competitive starting salary". This exact phrase was pronounced by presenters, provided on pamphlets, and printed on posters. When asked for a ballpark figure, there were hushed silences, nervous giggles, and a reply of "we can't really talk about that". Rumours get around - we know, or think we know, what you're paying. We may well be wrong. But none of us are going to send in our CVs if the rumours are that you've given up on South African Rands altogether and are instead counting out ground-grown legumes for your employees at the end of each month. We know that starting salaries may differ even within your company - but give us some idea. You ask for our exam marks - imagine if we put on our CVs "above average exam marks, with competitive additional achievements".

Spend five minutes finding out who we are
We're constantly reminded that the worst thing we can do in an interview or cover letter is confuse your company with your competitor. If I walk into a Korbitec interview and tell them how excited I am about working for BSG, it's over. But we had presenters who assumed we were from UCT, presenters who thought we were all Information Systems students, and even one who gave us a whole lecture on Astronomy and Physics. On the other end of the spectrum, one company had asked our department to send in summaries of our honours projects, and the CEO spoke to some of us individually about what we were working on (while we ate pizza and drank beer).

Give us nice toys
We've got a lot of pens, and company-branded lip-ice is not really our thing. We're not going to walk around with our keys on your company's lanyards. But again, we're students. You don't need to go all-out and buy us all new laptops, cars, and houses. Hoodies are great; one company gave us high-quality touch-screen styluses; even the sunglasses and yo-yos were used and not just chucked in the nearest bin (sometimes even bins in your offices, though mostly we were polite enough to use the ones outside). 

And finally, avoid clichés "like the plague". Every company "does things a little bit differently", they all "encourage growth in their employees", "think outside the box", and "have a strong employee focus". We know you like to think that you "empower us" to "reach our full potential"; that you are kind of into "viable solutions", "understanding culture", and "leveraging opportunities". We don't care about "sector specialists" or "open mindsets" that are "essentially very powerful", and we don't believe that you are "all about people". Your "company vision" is meaningless, and I doubt you could give an acceptable definition of "empathetic" if actually asked about it. Be straightforward with us: we're all intelligent enough to smell bullshit when it's shoved under our noses.

SSH Tunnelling for web access

Today I set up an SSH tunnel for the first time, and I was surprised at how easy it was! Using nothing but a simple SSH command and Firefox, you can route all your web traffic over an SSH connection, ensuring that it is all encrypted, and bypassing petty firewall rules. Completely hypothetically, this could also be used to gain access to a WiFi connection which allows SSH connections but redirects all HTTP requests to a "please sign up with your credit card details to access our slow WiFi at extortionate rates" page (as is the case with many public WiFi hotspots). I hope I need not reassure my readers that this is definitely not, in any way, why I needed an SSH tunnel.

What you need:
  • A computer located anywhere in the world with unfettered access to the internet, a static IP, and which is capable of accepting SSH connections. *
  • A computer which has restricted access to the internet.
  • Mozilla Firefox.
  • PuTTY, if the restricted machine is running Windows.
If the restricted machine is running Linux, simply open a terminal and enter the command:

ssh -D 8080 user@123.456.789.123

where 'user' and the IP address are those of the unrestricted computer. The -D flag sets up dynamic port forwarding: SSH opens a SOCKS proxy on local port 8080 and tunnels any traffic sent to that port over the SSH connection to the unrestricted machine. No setup on the remote machine is needed at all!
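
If you want to confirm the tunnel is up before touching any browser settings, one option is to check that something is accepting connections on the local port. The Python 3 sketch below does only that. The function name socks_port_is_open is made up for this example, and it assumes the port 8080 used in the command above. It shows that the port is open, not that traffic is actually reaching the remote machine.

```python
import socket


def socks_port_is_open(host="localhost", port=8080, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if socks_port_is_open():
        print("Something is listening on localhost:8080")
    else:
        print("Nothing is listening on localhost:8080; is the ssh command still running?")
```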

Now open Firefox and go to Options (or Preferences) -> Network -> Settings. Set to "Manual proxy configuration", fill in the SOCKS Host with "localhost" and the port with "8080". Leave the "HTTP Proxy" field blank. Press OK.


That's it. You should now have full web access through Firefox over the SOCKS proxy via the SSH tunnel!

If your restricted machine is running Windows, then you need PuTTY to make the SSH connection. Put the IP address of your unrestricted machine for "Host name (or IP Address)", then go to Connection -> Data in the tree menu on the left, and put the username for the unrestricted machine in the "Auto-login username" field. Finally go to Connection -> SSH -> Tunnels, put 8080 in the "Source port" field, select the "Dynamic" radio button, and hit the "Add" button. Press "Open" to open the SSH connection to the unrestricted machine. Firefox should now have full web access via SSH.



* A Digital Ocean VPS costs only $5/month. These work brilliantly for SSH tunnelling, as they have SSH access set up by default, and Digital Ocean is currently not charging for excess bandwidth. Here's my referral link for your convenience: https://www.digitalocean.com/?refcode=d7616f10aa59
If you use this link, I'll get $25 once you've spent $25 after signing up.