Monday 19 May 2014

Moving files from Dropbox to Mega

Note: I have not yet managed to achieve what I set out to do, but what follows is a good starting point for using the Dropbox and Mega APIs from Python.

My two-year, 50GB Dropbox trial expires in a couple of weeks. Mega, however, offers 50GB for free, with no time limit at present.

It makes sense, then, to migrate from Dropbox to Mega. Unfortunately, I have too many files on Dropbox to move them by hand. Luckily, both services provide APIs, and client libraries exist for Python. It would be nice if one could move all Dropbox files to Mega, maintaining the directory hierarchy, like this:

import mega
import dropbox

m = mega.login(username, password)
d = dropbox.login(username, password)

for f in d.get_files():
    m.upload(f)

Unfortunately, one can't. It's a bit more complicated.

The first step is to head over to the Dropbox website and get some API keys. Go to https://www.dropbox.com/developers/apps, click Create App, and select Dropbox API App.

Choose the following settings:
  • Files and Datastores (What type of data does your app need to store on Dropbox?)
  • No (Can your app be limited to its own folder?)
  • All file types (What type of files does your app need access to?)
You should be taken to a page which contains, among other things, an App key and an App secret. Take note of these.

Install the relevant Python packages:

pip install dropbox
pip install mega

Create a file called dropboxtomega.py and open it in your favourite text editor, Sublime Text. (If this isn't your favourite text editor, give it a try; it soon will be.)

The following code loops through all the files in your Dropbox account and saves them to the current working directory, maintaining the directory structure. This isn't what we want to do (if it were, we'd just use the official Dropbox sync app), but it's a good starting point.

import dropbox
import os

def recurse_folder(client, path, depth=0):
  # List the folder, then mirror each entry under the current directory
  folder_metadata = client.metadata(path)
  contents = folder_metadata.get("contents")
  for item in contents:
    if item.get("is_dir"):
      dirname = item.get("path")[1:] # remove leading slash
      print ".." * depth + dirname
      if not os.path.exists(dirname):
        os.makedirs(dirname)
      recurse_folder(client, item.get("path"), depth+1)
    else:
      fpath = item.get("path")
      print ".." * depth + fpath
      f = client.get_file(fpath)
      with open(fpath[1:], 'wb') as out:
        out.write(f.read())

app_key = 'xxxxxxxxxx'
app_secret = 'xxxxxxxxxxxxxxx'

flow = dropbox.client.DropboxOAuth2FlowNoRedirect(app_key, app_secret)

# Have the user sign in and authorize this token
authorize_url = flow.start()
print '1. Go to: ' + authorize_url
print '2. Click "Allow" (you might have to log in first)'
print '3. Copy the authorization code.'

code = raw_input("Enter the authorization code here: ").strip()
access_token, user_id = flow.finish(code)

# Create the client that recurse_folder uses to talk to Dropbox
client = dropbox.client.DropboxClient(access_token)
recurse_folder(client, "/")
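
The recurse_folder function above mixes walking the tree with acting on each file, so it may help to separate the two. Below is a minimal sketch, not part of the original script: iter_files is a hypothetical helper that relies only on the client.metadata() call used above, and FakeClient is a made-up stand-in so the traversal can be tried without a Dropbox account.

```python
def iter_files(client, path="/"):
    """Yield the Dropbox path of every file below `path`, depth first."""
    # client.metadata(path) is the same call used above; it is assumed to
    # return a dict whose "contents" list holds one dict per child entry.
    for item in client.metadata(path).get("contents", []):
        if item.get("is_dir"):
            # Recurse into sub-folders and pass their files up the chain
            for child in iter_files(client, item["path"]):
                yield child
        else:
            yield item["path"]


class FakeClient(object):
    """Hypothetical stand-in for the Dropbox client, for offline testing."""

    def __init__(self, tree):
        # tree maps a folder path to a list of (child_path, is_dir) pairs
        self.tree = tree

    def metadata(self, path):
        entries = self.tree.get(path, [])
        return {"contents": [{"path": p, "is_dir": d} for p, d in entries]}


fake = FakeClient({
    "/": [("/notes.txt", False), ("/photos", True)],
    "/photos": [("/photos/cat.jpg", False)],
})
found = list(iter_files(fake))
```

With a generator like this, the "what to do with each file" part (save locally, upload to Mega) becomes a plain for loop over iter_files(client).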

If we can save to disk keeping directory hierarchy, we should be able to do the same thing using Mega instead of local storage, right? Right??

Wrong. Unfortunately.

Although the Mega API does provide the functionality to create folders and to save files to specific folders, this doesn't work too well with the library I'm using. Let's leave Dropbox for now and take a look at Mega:

from mega import Mega

mega = Mega({'verbose':True}) #shows upload progress
m = mega.login("yourname@youremail.com","yourmegapassword")

m.upload("path/to/file.ext")

Looks simple, right? No API keys or access tokens. It Just Works. To create a directory I should be able to do:

m.create_folder("my_folder")

Which works. Then I should also be able to do this:

m.create_folder("my_sub_folder", dest="my_folder")

Which doesn't. It seems to succeed but the folder does not appear. I should also be able to do this:

m.upload("myfile.txt",dest="my_folder")

Which throws a timeout error. Just when things looked like they would be easy.
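
One possible explanation, untested here and offered only as a guess: the mega.py README uploads into a folder by passing the node returned by m.find() rather than the folder's name. The sketch below assumes that m.find(name) returns a (node_id, node_data) pair or None, and that m.upload() accepts the node id as its second argument. Both are assumptions about the library version in use; upload_into_folder and FakeMega are hypothetical names made up for illustration.

```python
def upload_into_folder(m, local_path, folder_name):
    """Upload local_path into the Mega folder called folder_name."""
    # Assumption: find() returns (node_id, node_data), or None if not found
    folder = m.find(folder_name)
    if folder is None:
        raise ValueError("no Mega folder named %r" % folder_name)
    # Assumption: the second argument of upload() is the destination node id
    return m.upload(local_path, folder[0])


class FakeMega(object):
    """Hypothetical stand-in for a logged-in Mega client, for offline use."""

    def __init__(self):
        self.folders = {"my_folder": "node-123"}
        self.uploads = []

    def find(self, name):
        if name in self.folders:
            return (self.folders[name], {"a": {"n": name}})
        return None

    def upload(self, filename, dest=None):
        # Record the call instead of talking to Mega
        self.uploads.append((filename, dest))
        return {"uploaded": filename}


fake_m = FakeMega()
upload_into_folder(fake_m, "myfile.txt", "my_folder")
```

If this guess is right, passing a node id as dest to create_folder might behave the same way, but that has not been checked either.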

Although we seem to be having difficulties moving the files while maintaining the directory structure, we can still move the files and abandon the hierarchy. Doing it this way, one file at a time, could be useful if, for example, the script had to run on a machine with less hard drive space than the total amount of data stored in Dropbox.

The biggest disadvantage is that it requires that no two files in Dropbox have the same name, even if they are in different directories. It would be trivial to catch exceptions and append -1, -2, etc. to the names of such files, but that's hacky enough to make even me cringe.

Note that even though the response from the Dropbox API seems to be a Python file object, it is in fact a custom REST response object. The easiest way to get the data into the format the Mega API needs is to save it to a temporary file on disk and upload that. This does add a lot of unnecessary disk IO, and there may well be a better way of converting the REST object to a Python file object.

The recurse function used to upload is as follows.

def recurse_folder(client, path, depth=0):
  folder_metadata = client.metadata(path)
  contents = folder_metadata.get("contents")
  for item in contents:
    if item.get("is_dir"):
      dirname = item.get("path")[1:] # remove leading slash
      print ".." * depth + dirname
      recurse_folder(client, item.get("path"), depth+1)
    else:
      fpath = item.get("path")
      print ".." * depth + fpath
      # Save the REST response to a temporary file on disk
      response = client.get_file(fpath)
      with open("tempfile", 'wb') as out:
        out.write(response.read())
      # Keep only the file name; the folder hierarchy is dropped
      fname = fpath.split("/")[-1]
      # m is the logged-in Mega client from the earlier snippet
      with open("tempfile", 'rb') as f:
        m.upload("tempfile", dest_filename=fname, input_file=f)
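
Two of the points above can be sketched in isolation. This is an illustration under stated assumptions rather than a tested part of the script: unique_name is a hypothetical helper implementing the cringe-worthy -1, -2 suffix idea, and spool_to_tempfile assumes only that the Dropbox response has a read(size) method, so it can be copied to disk in chunks instead of being read into memory all at once.

```python
import io
import os
import shutil
import tempfile


def unique_name(name, taken):
    """Return name, or name with -1, -2, ... before the extension if taken."""
    stem, ext = os.path.splitext(name)
    candidate, n = name, 0
    while candidate in taken:
        n += 1
        candidate = "%s-%d%s" % (stem, n, ext)
    taken.add(candidate)  # remember it so the next clash gets a new suffix
    return candidate


def spool_to_tempfile(response, chunk_size=64 * 1024):
    """Copy a file-like object to a named temp file and return its path."""
    # Assumption: response.read(size) works; copyfileobj calls it in a loop
    with tempfile.NamedTemporaryFile(delete=False) as out:
        shutil.copyfileobj(response, out, chunk_size)
    return out.name  # the caller is responsible for deleting this file


# Offline demonstration, using BytesIO in place of a Dropbox response
taken = set()
names = [unique_name("report.txt", taken) for _ in range(3)]

tmp_path = spool_to_tempfile(io.BytesIO(b"hello"))
with open(tmp_path, "rb") as fh:
    data = fh.read()
os.remove(tmp_path)
```

A unique temporary path per file also avoids reusing the single hard-coded "tempfile" name, which matters if two copies of the script ever run in the same directory.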


Finale
I thought it would be simple. It wasn't.

The easy way out: download the Dropbox sync app (which you probably have already if you've been using Dropbox) and the Mega sync app. Once all your Dropbox files are synced to a local Dropbox folder, copy them to a local Mega folder, and allow the Mega app to sync them back to the cloud.

Pros: fast, easy, likely to work
Cons: You don't get to mess around with Python and APIs

Your choice.

1 comment:

  1. A solution if you really have a crappy upload speed (20kB/s here, and I had 15GB on my Dropbox):
    Rent a DigitalOcean server for a few hours (you can have it free using a code)
    Use the Dropbox app to download all your files (very quick, at 20 MB/s)
    Use megatools to upload everything to Mega (a bit longer because it works file by file)

    I was able to migrate all my data in a couple of hours (DigitalOcean lets you pay on a per-hour basis), where it would have taken months with my ADSL connection
