USBBackupScript

From RdiffBackupWiki

Overview

This is a relatively complete Python script that performs a series of rdiff-backup runs to a USB hard disk. I use it to back up different disks on a few Xen servers, where LVM is used to create disks for each virtual machine, as well as on other standalone machines.

Note that this has only been run and tested under Ubuntu. It should work on other Linux distributions; if you try it on one, please add a comment (or your fixes) to this page with your findings.

Basic process:

  1. Put rdiff-backup-usb.py into /usr/local/bin (or a similar location) and make it executable
  2. Put rdiff-backup-usb.ini into /etc/backup
  3. Edit /etc/backup/rdiff-backup-usb.ini for your system - see the Configuration section below
  4. Prepare your backup disk(s). This involves:
    1. Formatting them
    2. Creating a file called BACKUP_OK in the root directory
    3. Creating a backup folder in the root directory
  5. Create the mountpoints you're using (/mnt/backup_usb and /mnt/backup_lvm)
  6. Add an fstab entry so that it can be mounted by a command like mount /mnt/backup_usb
  7. Add udev rules if you want it to automatically start when the disk is connected (see the Udev section below)
  8. Run it in dryrun mode, then for real (see the Running section)
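
Before the first dry run it can be handy to confirm that the disk was prepared as described in step 4. The following is only a rough sketch, not part of the script: the function name check_backup_disk is made up for illustration, and the defaults assume the stock BACKUP_OK and backup names. It is written for Python 3, whereas the script itself is Python 2.

```python
import os

def check_backup_disk(mountpoint, access_file="BACKUP_OK", backup_dir="backup"):
    """Return a list of problems found on a mounted backup disk (empty if it looks OK)."""
    problems = []
    # The marker file tells the backup script that this disk may be used.
    if not os.path.isfile(os.path.join(mountpoint, access_file)):
        problems.append("missing marker file: " + access_file)
    # Each backup is stored in its own directory below this folder.
    if not os.path.isdir(os.path.join(mountpoint, backup_dir)):
        problems.append("missing backup folder: " + backup_dir)
    return problems
```

With the disk mounted, check_backup_disk("/mnt/backup_usb") should return an empty list if both items are in place.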

Configuration

Read the rdiff-backup-usb.ini file and customise it for your site. Don't delete options from the file, because the script will probably break. Add a new section for each backup set, and remember to set backup_count in the [general] section to the right number when you're done. The paths in all includes/excludes are appended to the source path for each backup (prefixed with the LVM mount path for LVM backups).
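
A mismatch between backup_count and the numbered sections is an easy mistake to make. As a hedged sketch (not code from the script), the check below flattens an ini document into "section.option" keys, much as the script's loadConfig does, and lists any numbered sections that are missing. It uses Python 3's configparser, the sample ini text is invented, and missing_backup_sections is a hypothetical helper name.

```python
import configparser

# Invented sample; a real file has many more options per section.
SAMPLE_INI = """
[general]
backup_count = 2

[1]
name = dom0_server

[2]
name = vserver1
"""

def load_flat_config(text):
    """Flatten an ini document into a dict keyed by 'section.option'."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    flat = {}
    for section in parser.sections():
        for option in parser.options(section):
            key = section.lower() + "." + option.lower()
            flat[key] = parser.get(section, option).strip()
    return flat

def missing_backup_sections(flat):
    """Return the numbered sections that backup_count promises but the file lacks."""
    count = int(flat["general.backup_count"])
    return [str(i) for i in range(1, count + 1) if (str(i) + ".name") not in flat]

flat = load_flat_config(SAMPLE_INI)
print(missing_backup_sections(flat))  # prints []
```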

If you need to change the arguments to rdiff-backup or the email text, edit the .py file.
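
To see what ends up on the rdiff-backup command line, here is a simplified Python 3 sketch of how the script assembles it: the default arguments first, then the per-backup includes/excludes, then the general ones, then source and destination. The function names are mine and the example paths are only illustrative; the option names are the ones used in the script.

```python
DEFAULT_ARGS = [
    "--exclude-device-files",
    "--exclude-sockets",
    "--print-statistics",
    "--preserve-numerical-ids",
]

def selection_args(value, flag, base_path):
    """Turn a space-separated list of relative paths into selection options."""
    return [flag + " " + base_path + item for item in value.split()]

def build_command(rdiff_bin, base_path, dest_dir, includes="", excludes="",
                  general_includes="", general_excludes=""):
    """Assemble the command line in the same order the script uses."""
    args = list(DEFAULT_ARGS)
    args += selection_args(includes, "--include", base_path)
    args += selection_args(excludes, "--exclude", base_path)
    args += selection_args(general_includes, "--include", base_path)
    args += selection_args(general_excludes, "--exclude", base_path)
    args += [base_path, dest_dir]
    return rdiff_bin + " " + " ".join(args)

cmd = build_command("/usr/bin/rdiff-backup", "/mnt/backup_lvm/",
                    "/mnt/backup_usb/backup/vserver1",
                    excludes="home/scratch", general_excludes="tmp proc")
print(cmd)
```

Running the script with -d and -v logs the real command line without executing it, which is the safer way to check your own configuration.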

Running

When you run the script, it will do the following things (assuming you haven't changed the config):

  • Load the config file from /etc/backup/rdiff-backup-usb.ini
  • Check for any other instances of the script that are running (in case one got stuck or took too long). If it finds one, it logs this and emails a report.
  • Mount the USB disk at /mnt/backup_usb. If this fails, it's not reported as an error (so you can run from cron), but the script exits with a result code of 1
  • Check that a file BACKUP_OK is present in the root directory. If it is missing, it's not reported as an error, but the script exits with a result code of 1
  • Run each backup defined in the config file, in numeric order:
    • Create a directory for the backup name if it doesn't exist, e.g. /mnt/backup_usb/backup/<name>
    • Create an LVM snapshot of the volume and mount it (if lvm=yes is set)
    • Run rdiff-backup
    • Unmount the LVM snapshot and remove it (LVM backups only)
  • Check the free space remaining on the USB disk
  • Unmount the USB disk
  • Produce a report for each backup
  • Append the report to /var/log/backup.log
  • Email the full report as an attachment to the admin, with the summary in the main message body.
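
The per-backup steps in the list above rely on nested try/finally blocks so that a snapshot is unmounted and removed even when rdiff-backup fails. The sketch below shows only that control flow, in Python 3. Nothing is actually executed: run is a stand-in callable that I made up for the demonstration, and the step names are just labels.

```python
def run_backup(name, use_lvm, run):
    """Run one backup; `run` executes a named step and may raise on failure."""
    if use_lvm:
        run("create snapshot")
    try:
        if use_lvm:
            run("mount snapshot")
        try:
            run("rdiff-backup " + name)
        finally:
            # Reached whether or not rdiff-backup succeeded.
            if use_lvm:
                run("unmount snapshot")
    finally:
        # Reached even if mounting the snapshot failed.
        if use_lvm:
            run("remove snapshot")

# Demonstration: record the steps and make the backup step fail.
steps = []

def recorder(step):
    steps.append(step)
    if step.startswith("rdiff-backup"):
        raise RuntimeError("backup failed")

try:
    run_backup("vserver1", True, recorder)
except RuntimeError:
    pass

print(steps)
```

Even though the backup step raises, the recorded steps still end with the unmount and the snapshot removal.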

If it encounters errors while running a single backup, it continues with the remaining ones. It also tries hard to clean up after itself (removing snapshots, unmounting things, etc.). It uses a lock file, so it won't run again if a previous run didn't exit cleanly.
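
For the curious, the lock file idea looks roughly like this. Note that this is a variation, not the script's own code: the script checks whether the file exists and then creates it, while this Python 3 sketch creates the file with O_EXCL so that checking and creating happen in one step. The function names are hypothetical.

```python
import atexit
import os

def _remove_quietly(path):
    """Delete the lock file, ignoring the case where it is already gone."""
    try:
        os.remove(path)
    except OSError:
        pass

def acquire_lock(path, force=False):
    """Try to take the lock; return True if this process may proceed."""
    try:
        # O_EXCL makes the call fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
    except FileExistsError:
        if not force:
            return False
    # Remove the lock when the interpreter exits normally.
    atexit.register(_remove_quietly, path)
    return True
```

A caller would do something like acquire_lock("/var/lock/rdiff-backup-usb") and skip the backups when it returns False. Passing force=True corresponds to the -f option described below.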

You can specify some command-line options:

 -h print out some help
 -d dryrun - this will go through the motions without doing anything dangerous. It will mount the USB disk (and create a missing destination directory on it), but it won't run rdiff-backup or create/mount/remove any LVM snapshots
 -c specify an alternative config file to use
 -f ignore existence of the lock file (/var/lock/rdiff-backup-usb) and run anyway
 -v display verbose messages for debugging
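
The options are parsed with getopt. A stripped-down Python 3 sketch of that parsing, using the same option string as the script, is shown here; the parse_args function and its settings dict are my own simplification for illustration.

```python
import getopt

DEFAULT_CONFIG_PATH = "/etc/backup/rdiff-backup-usb.ini"

def parse_args(argv):
    """Parse command-line options into a dict; raises getopt.GetoptError on bad input."""
    opts, _ = getopt.getopt(argv, "dfhc:v", ["dryrun", "force", "help", "config="])
    settings = {
        "help": False,
        "dryrun": False,
        "force": False,
        "verbose": False,
        "config": DEFAULT_CONFIG_PATH,
    }
    for opt, value in opts:
        if opt in ("-h", "--help"):
            settings["help"] = True
        elif opt in ("-d", "--dryrun"):
            settings["dryrun"] = True
        elif opt in ("-f", "--force"):
            settings["force"] = True
        elif opt in ("-c", "--config"):
            settings["config"] = value
        elif opt == "-v":
            settings["verbose"] = True
    return settings

print(parse_args(["-d", "-c", "/tmp/test.ini"]))
```

As the option list passed to getopt shows, -d, -f, -h and -c also have long forms (--dryrun, --force, --help, --config).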

Udev

If you want your backups to run automatically when you connect a disk, you can trigger the script from udev. This is probably the most Ubuntu-specific part, but there is great documentation on udev rules at http://www.reactivated.net/writing_udev_rules.html if you get stuck.

Create a file called 70-rdiff-backup-usb.rules in /etc/udev/rules.d. The contents are a single line:

  SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_LABEL}=="BACKUP_*", SYMLINK+="rdiff_backup_usbdisk", RUN+="/usr/local/bin/rdiff-backup-usb.py &"

When a disk is connected with a filesystem label starting with "BACKUP_", we create a symlink to it at /dev/rdiff_backup_usbdisk, and run the backup script. Our fstab entry will then look like this (for an ext2 disk):

  /dev/rdiff_backup_usbdisk   /mnt/backup_usb   ext2   noauto   0   0

It doesn't matter whether the USB disk shows up as sdc, sdd, sde, or anything else - it will always have a symlink at /dev/rdiff_backup_usbdisk. A series of disks (BACKUP_01, BACKUP_02, etc.) is handled in the same way. You can set filesystem labels using e2label (ext2/ext3) or xfs_admin (xfs).
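
If you are unsure whether a label will trigger the rule, you can approximate the match in Python. This is only an approximation: udev does its own shell-style pattern matching, and the sketch assumes that fnmatch behaves the same way for a simple pattern like BACKUP_*. The function name is made up.

```python
from fnmatch import fnmatchcase

# The pattern from the ENV{ID_FS_LABEL} match in the rule above.
LABEL_PATTERN = "BACKUP_*"

def label_triggers_backup(label):
    """Return True if a filesystem label matches the pattern (case-sensitive)."""
    return fnmatchcase(label, LABEL_PATTERN)

for label in ("BACKUP_01", "BACKUP_02", "backup_01", "PHOTOS"):
    print(label, label_triggers_backup(label))
```

The match is case-sensitive, so a disk labelled backup_01 would not start a backup.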

The only problem with this trick is restoring or otherwise mounting the disk (when you don't want a backup) on the same system. Maybe someone could suggest how to filter it by USB port, or start the backup after 5 minutes unless XYZ happens, or ...

If you don't want the auto-start you can also use this udev script to consistently get device names for putting into /etc/fstab (remove the RUN clause), or check out /dev/disk/by-* for similar ways to do it.
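
To find out which real device node currently sits behind such a symlink (whether it is /dev/rdiff_backup_usbdisk or an entry under /dev/disk/by-*), a few lines of Python are enough. This is a small sketch with a hypothetical function name, not something the backup script does itself.

```python
import os

def resolve_device(link_path):
    """Return the real path a device symlink points at, or None if there is no such link."""
    if not os.path.islink(link_path):
        return None
    return os.path.realpath(link_path)
```

For example, resolve_device("/dev/rdiff_backup_usbdisk") returns None while no backup disk is connected.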

Important

Don't blame me if your disks get deleted or the backups don't work; the code is all there for you to review and change yourself! You can reach me with questions or problems at robert.coup@onetrackmind.co.nz, or just change the source here.

The Source

rdiff-backup-usb.py

#!/usr/bin/env python

# rdiff-backup-usb.py
# Robert Coup, One Track Mind
# July 2006

"""
A script for doing backups using rdiff-backup. It is designed to be run
from a udev usb-disk-arrived event.
"""

DEFAULT_CONFIG_PATH = "/etc/backup/rdiff-backup-usb.ini"
DEFAULT_RDIFFBACKUP_ARGS = [
        "--exclude-device-files",
        "--exclude-sockets",
        "--print-statistics",
        "--preserve-numerical-ids"
        ]

EMAIL_BODY = """
The backup started at %s is now complete.

You may now safely turn off then disconnect the external hard
disk.  To start the next backup, just re-attach the external
hard disk.

Here is the summary report for the backup run - see the attached log
for more details.

-----------------------
%s
-----------------------
USB Disk Space:
%s
"""

#imports
import os
import os.path
import time
import sys
import ConfigParser
import getopt
import traceback
import email
import email.MIMEText
import email.MIMEMultipart
import smtplib
import atexit

# globals
config = {}
actionLog = []
curLog = None
verbose = False
dryrun = False

class LvmError(Exception):
        pass
class MountError(Exception):
        pass
class RdiffBackupError(Exception):
        pass

class LogSection:
        def __init__(self, sectionName):
                self.name = sectionName
                self.startTime = time.time()
                self.log = []
                self.success = True
                self.endTime = 0
                self.error = None

class BackupReport:
        def __init__(self):
                self.success = False
                self.log = ""
                # summary is filled in by createReport() and read by logReport()
                self.summary = ""
                self.email = ""


def debug(msg):
        if (verbose):
                log(msg)

def info(msg):
        log(msg)
        if (not verbose):
                print msg

def log(msg):
        global curLog
        curLog.log.append(str(msg))
        if (verbose):
                print msg

def logCmd(cmd, doOnDry=False):
        if (dryrun and not doOnDry):
                log("SKIPPING running command (dryrun): " + cmd)
                return None
        (exitStatus, output) = runCmd(cmd)
        log(output)
        return exitStatus

def infoCmd(cmd, doOnDry=False):
        if (dryrun and not doOnDry):
                info("SKIPPING running command (dryrun): " + cmd)
                return None
        (exitStatus, output) = runCmd(cmd)
        info(output)
        return exitStatus

def runCmd(cmd):
        app = os.popen(cmd + " 2>&1", 'r')
        output = app.read()
        return (app.close(), output)

def logSetSection(newSectionName):
        global actionLog, curLog
        if (curLog != None):
                curLog.endTime = time.time()
        nl = LogSection(newSectionName)
        actionLog.append(nl)
        curLog = nl

def logReport(report):
        lf = open(config["general.log_file"], "a")
        lf.write("%s - New Backup Run\nSummary:\n" % time.strftime('%c %Z', time.localtime(time.time())) )
        lf.write(report.summary)
        lf.write("\n Full log:\n")
        lf.write(report.log)
        lf.write("\n")
        lf.close()

def emailReport(report):
        msg = email.MIMEMultipart.MIMEMultipart()
        msg["To"] = config["email.to"]
        msg["From"] = config["email.from"]
        if (report.success):
                msg["Subject"] = config["email.subject"] + " - SUCCESS"
        else:
                msg["Subject"] = config["email.subject"] + " - FAILURE"
        if (dryrun):
                msg["Subject"] += " *** DRY RUN ONLY ***"
        msg.preamble = "Mime message\n"
        msg.epilogue = ""

        msgBody = email.MIMEText.MIMEText(report.email)
        msg.attach(msgBody)

        logAttachment = email.MIMEText.MIMEText(report.log)
        logAttachment.add_header("Content-disposition", "attachment", filename="backup_log.txt")
        msg.attach(logAttachment)

        smtp = smtplib.SMTP(config["email.smtp_server"])
        smtp.sendmail(config["email.from"], config["email.to"], msg.as_string())
        smtp.quit()

def createReport(diskFreeReport):
        global curLog
        curLog.endTime = time.time()

        r = BackupReport()
        full = []
        summary = []

        allStart = actionLog[0].startTime
        allEnd = curLog.endTime
        if (dryrun):
                summary.append("*** DRY RUN ONLY ***")
        summary.append("Started at: %s" % time.strftime('%c %Z', time.localtime(allStart)))
        summary.append("Ended at: %s" % time.strftime('%c %Z', time.localtime(allEnd)))
        summary.append("Total time: %d minutes" % ((allEnd - allStart) / 60))

        r.success = True
        for sec in actionLog:
                r.success = r.success and sec.success
                full.append("\n----------- %s ------------" % sec.name)
                full.append("  Start: %s" % time.strftime('%c %Z', time.localtime(sec.startTime)))
                full.append("  End: %s" % time.strftime('%c %Z', time.localtime(sec.endTime)))
                full.append("  Successful: %s" % sec.success)
                if (sec.error != None):
                        full.append("  Error: %s" % sec.error)
                full.append("  -----------------------")
                full.extend(sec.log)

                summary.append("   %s: %s" % (sec.name, sec.success))

        summary.append("Overall success: %s" % r.success)

        r.log = "\n".join(full)
        r.summary = "\n".join(summary)
        r.email = EMAIL_BODY % (time.strftime('%c %Z', time.localtime(allStart)), r.summary, diskFreeReport)

        return r

def addArgList(configKey, baseArgument):
        al = []
        if (config.has_key(configKey) and len(config[configKey]) > 0):
                # split() with no argument tolerates repeated spaces between entries
                for item in config[configKey].split():
                        al.append(baseArgument + item)
        return al


def processBackup(index):
        idx = str(index) + "."
        name = config[idx+"name"]
        logSetSection("%d - %s" % (index, name))
        info("Beginning backup %d: %s ..." % (index, name))

        basePath = config[idx+"source"]
        destDir = config["general.disk_mountpoint"] + config["general.backup_path"] + name

        lvm = (config[idx+"lvm"].lower() == "yes")
        if (lvm):
                basePath = config["general.lvm_mountpoint"] + config[idx+"source"]

        # build up the command line
        cmdArgs = DEFAULT_RDIFFBACKUP_ARGS[:]

        cmdArgs.extend(addArgList(idx+"includes", "--include " + basePath))
        cmdArgs.extend(addArgList(idx+"excludes", "--exclude " + basePath))
        cmdArgs.extend(addArgList("general.includes", "--include " + basePath))
        cmdArgs.extend(addArgList("general.excludes", "--exclude " + basePath))

        cmdArgs.append(basePath)
        cmdArgs.append(destDir)

        # create destination directory (if it doesn't exist)
        if (not os.path.isdir(destDir)):
                os.mkdir(destDir)

        if (lvm):
                # create a snapshot
                cmdLine = "/sbin/lvcreate -L%s -s -n %s /dev/%s/%s" % (config["general.lvm_snapshotsize"], config["general.lvm_snapshotname"], config[idx+"lvm_vg"], config[idx+"lvm_lv"])
                log("Creating LVM snapshot: %s" % cmdLine)
                time.sleep(3)
                res = logCmd(cmdLine)
                if (res != None):
                        raise LvmError("Creating snapshot: %d" % res)

        try:
                if (lvm):
                        # mount the snapshot
                        if (config.has_key(idx+"mountoptions")):
                                mountOpts = "," + config[idx+"mountoptions"]
                        else:
                                mountOpts = ""
                        cmdLine = "/bin/mount -s /dev/%s/%s %s -oro%s" % (config[idx+"lvm_vg"], config["general.lvm_snapshotname"], config["general.lvm_mountpoint"], mountOpts)
                        log("Mounting LVM snapshot: %s" % cmdLine)
                        res = logCmd(cmdLine)
                        if (res != None):
                                raise MountError("Mounting snapshot: %d" % res)

                try:
                        # run rdiff-backup
                        cmdLine = "%s %s" % (config["general.rdiffbackup_bin"], " ".join(cmdArgs))
                        log("Running rdiff-backup: " + cmdLine)
                        res = infoCmd(cmdLine)
                        if (res != None):
                                raise RdiffBackupError("Result code = %d" % res)

                finally:
                        if (lvm):
                                # unmount the snapshot
                                cmdLine = "/bin/umount %s" % config["general.lvm_mountpoint"]
                                log("Un-mounting LVM snapshot: %s" % cmdLine)
                                res = logCmd(cmdLine)
                                if (res != None):
                                        raise MountError("Un-mounting snapshot: %d" % res)

        finally:
                if (lvm):
                        # delete the snapshot
                        cmdLine = "/sbin/lvremove -f /dev/%s/%s" % (config[idx+"lvm_vg"], config["general.lvm_snapshotname"])
                        log("Deleting LVM snapshot: %s" % cmdLine)
                        time.sleep(3)
                        res = logCmd(cmdLine)
                        if (res != None):
                                raise LvmError("Deleting snapshot: %d" % res)


        info("Backup completed successfully.")

def checkMagicFile():
        """     Checks that the disk has the magic file in the root """
        return os.path.exists(config["general.disk_mountpoint"] + config["general.access_file"])

def mountDisk():
        cmdLine = "/bin/mount %s" % config["general.disk_mountpoint"]
        log("Mounting: %s: %s" % (config["general.disk_mountpoint"], cmdLine))
        res = logCmd(cmdLine, True)
        if (res != None):
                log("Mount error: %s" % res)
        return (res == None)

def unmountDisk():
        cmdLine = "/bin/umount %s" % config["general.disk_mountpoint"]
        log("Un-mounting %s: %s" % (config["general.disk_mountpoint"], cmdLine))
        res = logCmd(cmdLine, True)
        if (res != None):
                log("Un-mount error: %s" % res)
        return (res == None)

def loadConfig(file, config={}):
        config = config.copy()
        cp = ConfigParser.ConfigParser()
        cp.read(file)
        for sec in cp.sections():
                name = sec.lower()
                for opt in cp.options(sec):
                        config[name + "." + opt.lower()] = cp.get(sec, opt).strip()
        return config

def setAppLock(ignore):
        lock = config["general.lock_file"]
        if (os.path.exists(lock)):
                info("Lock file %s exists!" % lock)
                if (not ignore):
                        return False
                else:
                        info("ignoring existing lockfile...")
        else:
                file(lock, "w").close()

        atexit.register(lambda:os.remove(lock))
        return True

def usage():
        print """
rdiff-backup-usb
  usage: rdiff-backup-usb.py [-h] [-v] [--dryrun] [--force] [--config config_file]
    --help,-h      print this help and exit
    -v             verbose
    --config,-c    specify an alternative config file (default is %s)
    --dryrun,-d    do a dry run without running lvm or rdiff-backup stuff
    --force,-f     ignore an existing lockfile and run anyway
""" % DEFAULT_CONFIG_PATH

def main():
        global config, verbose, dryrun, startTime
        logSetSection("begin")

        try:
                opts, args = getopt.getopt(sys.argv[1:], "dfhc:v", ["dryrun", "force", "help", "config="])
        except getopt.GetoptError:
                usage()
                sys.exit(2)
        configFile = DEFAULT_CONFIG_PATH
        ignoreLock = False
        for o, a in opts:
                if o == "-v":
                        verbose = True
                if o in ("-h", "--help"):
                        usage()
                        sys.exit(0)
                if o in ("-d", "--dryrun"):
                        dryrun = True
                if o in ("-f", "--force"):
                        ignoreLock = True
                if o in ("-c", "--config"):
                        configFile = a

        try:
                if (not os.path.isfile(configFile)):
                        raise Exception
                config = loadConfig(configFile)
        except:
                print "Can't open config file: %s" % configFile
                sys.exit(3)

        debug(config)
        if (dryrun):
                info("*** This is a DRY RUN ONLY ***")

        if (setAppLock(ignoreLock)):
                if (not mountDisk()):
                        info("Error during mount - quitting...")
                        sys.exit(1)

                if (not checkMagicFile()):
                        info("Disk doesn't have the correct magic file - quitting...")
                        unmountDisk()
                        sys.exit(1)

                info("Starting backups...")
                for ii in range(1, int(config["general.backup_count"])+1):
                        try:
                                processBackup(ii)
                        except Exception,e:
                                info("Got error: " + traceback.format_exc())
                                curLog.success = False
                                curLog.error = str(e)

                logSetSection("end")
                # get free disk space report for USB disk
                (dfExitStatus, dfReport) = runCmd("/bin/df -h %s" % config["general.disk_mountpoint"])
                info(dfReport)
                if (dfExitStatus != None):
                        curLog.success = False

                # unmount the USB disk
                if (not unmountDisk()):
                        info("Error unmounting USB disk")
                        curLog.success = False
        else:
                # there is an instance already running - send & save a report of this
                curLog.success = False
                curLog.error = "Existing backup instance already running"
                dfReport = ""

        # create the backup report
        results = createReport(dfReport)

        try:
                # log report to file
                logReport(results)
        except:
                print("Error logging results: " + traceback.format_exc())

        if (config["email.enable"].lower() == "yes"):
                try:
                        # email report
                        emailReport(results)
                except:
                        print("Error emailing results: " + traceback.format_exc())

        print "\n\nBACKUP SUMMARY:\n" + results.summary

if __name__ == '__main__':
    main()

rdiff-backup-usb.ini

[general]
# path to the mount point of the USB disk. fstab should be configured
# so that 'mount <disk_mountpoint>' just works. (trailing '/')
disk_mountpoint = /mnt/backup_usb/

# path to a mountpoint for the lvm snapshot (no trailing '/')
lvm_mountpoint = /mnt/backup_lvm
# what the logical volume name is for the backup snapshots
# BE CAREFUL THIS DOESN'T MATCH AN EXISTING VOLUME OR IT WILL BE DELETED
lvm_snapshotname = rdiff-backup-usb
# how big to make our LVM snapshots (we only do one at a time)
lvm_snapshotsize = 2G

# path to backups within the disk (trailing '/')
backup_path = backup/

# file that should be located in the root of the backup disk for us to
# start using it
access_file = BACKUP_OK

# log file
log_file = /var/log/backup.log

# lock file
lock_file = /var/lock/rdiff-backup-usb

# path to rdiff-backup
rdiffbackup_bin = /usr/bin/rdiff-backup

# global includes and excludes (no leading or trailing '/')
# the script passes these to rdiff-backup _after_ the includes/excludes
# from a backup below
includes =
excludes = tmp var/tmp var/log var/cache proc mnt media sys

# how many backups are we doing?
# there needs to be one section for each backup location
# the section title should be named in numeric order starting from 1
backup_count = 2

[email]
# enable sending of reports by email
enable = yes
# from and to addresses
from = rdiff-backup-usb@example.com
to = backupmaster@example.com
# base subject line for the email
subject = Backup: MYSERVER
smtp_server = localhost

[1]
# the directory name this will be stored under on the disk
name = dom0_server
# is this an lvm volume
lvm = no
# path to the backup source (leading and trailing '/')
source = /
# directories to include relative to source (no leading or trailing '/')
#includes =
# directories to exclude relative to source (no leading or trailing '/')
#excludes =

[2]
# the directory name this will be stored under on the disk
name = vserver1
# is this an lvm volume
lvm = yes
# the lvm volume group it's on
lvm_vg = vg-myserver
# the lvm logical volume it's on
lvm_lv = vserver1
# Any mount options - specify them here in the same form as for mount (ie. no spaces)
# NOTE: nouuid is required for lvm snapshots on xfs filesystems
mountoptions = nouuid
# path to the backup source within the lvm volume (leading and trailing '/')
source = /
# directories to include relative to source (no leading or trailing '/')
#includes =
# directories to exclude relative to source (no leading or trailing '/')
#excludes =