Automatic database backups to Amazon S3 on Heroku

Heroku recently switched its database backups from its own bundles system to the standard pg_dump utility, by way of the PG Backups add-on.

This makes it a snap to download a backup and load it onto your development machine, so you can work with the same data as your production server. Better yet, you can upload a pg_dump to Amazon S3 (or anywhere else) and then instruct Heroku to restore from it. That is a great change, and it removes much of the past difficulty of reanimating your backups in the event of trouble.

One thing that Heroku has never made dead easy, though, is automatic backups, which should be a basic part of any well-managed application.

Fortunately, Heroku has a Cron add-on for automated jobs, which can be combined with the PG Backups add-on to automatically upload a copy of your database to Amazon S3 every day. Best of all, both the daily Cron add-on and the PG Backups basic add-on are free!

The Heroku Cron add-on automatically calls the cron task in lib/tasks/cron.rake. In addition to the following code, you need the Heroku and AWS-S3 gems installed and set up in your application. Your S3 credentials are read from config/s3.yml (keyed by Rails environment), but your Heroku credentials have to be added manually to the file. Here is the code that I have in my cron task to automate my backups:

require "heroku"
require "heroku/command"
require "open-uri" # lets open() fetch the backup's public URL

task :cron => :environment do
  Rake::Task['backups:backup'].invoke
end

namespace :backups do
  desc "create a pg_dump and send to S3"
  task :backup => :environment do
 
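    # Fill these in before deploying.
    HEROKU_USERNAME = ''
    HEROKU_PASSWORD = ''
    APP_NAME = ''
    BACKUP_BUCKET = ''
    # Appended directly to the bucket name, so start it with a slash (or leave it empty).
    PATH_INSIDE_BUCKET = ''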
  
    puts "Backup started @ #{Time.now}"

    heroku = Heroku::Client.new HEROKU_USERNAME, HEROKU_PASSWORD

    puts "Capturing new pg_dump"
    Heroku::Command.run_internal 'pgbackups:capture', ['--expire', '--app', APP_NAME], heroku
   
    puts "Opening S3 connection"
    config = YAML.load_file("#{RAILS_ROOT}/config/s3.yml")[RAILS_ENV]
    AWS::S3::Base.establish_connection!(
      :access_key_id     => config['access_key_id'],
      :secret_access_key => config['secret_access_key']
    )

    begin
      bucket = AWS::S3::Bucket.find(BACKUP_BUCKET)
    rescue AWS::S3::NoSuchBucket
      AWS::S3::Bucket.create(BACKUP_BUCKET)
      bucket = AWS::S3::Bucket.find(BACKUP_BUCKET)
    end

    puts "Opening new pg_dump"
    pg_backup = Heroku::Command::Pgbackups.new(['--app', APP_NAME], heroku)
    local_pg_dump = open(pg_backup.pgbackup_client.get_latest_backup['public_url'])
    puts "Finished opening new pg_dump"

    puts "Uploading to S3 bucket"
    AWS::S3::S3Object.store(Time.now.to_s(:number), local_pg_dump, bucket.name + PATH_INSIDE_BUCKET)

    puts "Backup completed @ #{Time.now}"
  end
 
end
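
The task above expects config/s3.yml to have one section per Rails environment, each with an access_key_id and a secret_access_key. Here is a rough sketch of that shape and of the lookup the task performs. The key values are placeholders, and the s3_settings helper is my own illustration, not part of either gem:

```ruby
require "yaml"

# Hypothetical contents of config/s3.yml: one section per Rails environment,
# each holding the two keys the backup task reads. Values are placeholders.
S3_YAML = <<-EOS
development:
  access_key_id: DEV_ACCESS_KEY_ID
  secret_access_key: DEV_SECRET_ACCESS_KEY
production:
  access_key_id: PROD_ACCESS_KEY_ID
  secret_access_key: PROD_SECRET_ACCESS_KEY
EOS

# Illustrative helper (an assumption, not from the gems): parse the YAML and
# pick the section for the given environment, as the task does with RAILS_ENV.
# Raises instead of returning nil so a missing section fails loudly.
def s3_settings(yaml_text, environment)
  settings = YAML.load(yaml_text)[environment]
  raise ArgumentError, "no S3 settings for #{environment}" if settings.nil?
  settings
end

config = s3_settings(S3_YAML, "production")
puts config["access_key_id"]
```

If the section for the current environment is missing, the original task would fail with a less obvious error on nil, so a check like this one may be worth adding.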

If you're not a fan of AWS-S3 and want to use Fog instead, definitely check out bakkuappu:

https://github.com/mrich54907/bakkuappu

In writing this, I consulted both bakkuappu and this page:

http://metaskills.net/2011/01/03/automating-heroku-pg-backups/

This whole thing is based on another script I had running to back up bundles, which was in turn based heavily on:

http://gist.github.com/451597/6c1945765e4091b73df70835ee8e3be6e963bd77