USBBackupScript
From RdiffBackupWiki
Overview
This is a relatively complete Python script to perform a series of rdiff-backups to a USB hard disk. I use it for backing up different disks on a few Xen servers, where LVM is used to create disks for each virtual machine, as well as on other standalone machines.
Note that this has only been run and tested under Ubuntu. It should work on other Linux distributions; if you try it elsewhere, please add a comment (or your fixes) to the page with your findings. The script is written for Python 2 (it uses print statements and the ConfigParser module).
Basic process:
- Put rdiff-backup-usb.py into /usr/local/bin or somewhere and make it executable
- Put rdiff-backup-usb.ini into /etc/backup
- Edit /etc/backup/rdiff-backup-usb.ini for your system - see the Configuration section below
- Prepare your backup disk(s). This involves:
  - Formatting them
  - Creating a file called BACKUP_OK in the root directory
  - Creating a backup folder in the root directory
- Create the mountpoints you're using (/mnt/backup_usb and /mnt/backup_lvm)
- Add an fstab entry so that it can be mounted by a command like mount /mnt/backup_usb
- Add udev rules if you want it to automatically start when the disk is connected (see the UDev section below)
- Run it in dryrun mode, then for real (see the Running section)
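The disk-preparation step (after formatting and mounting) could be sketched in Python roughly as follows. This is only an illustration, not part of the script: the helper name `prepare_backup_disk` is hypothetical, the code is written for Python 3, and the names `BACKUP_OK` and `backup` assume the default `access_file` and `backup_path` settings from the .ini file.

```python
import os
import tempfile


def prepare_backup_disk(mount_root, access_file="BACKUP_OK", backup_dir="backup"):
    """Create the marker file and backup folder in the root of a mounted disk.

    Hypothetical helper: the defaults mirror the shipped .ini file.
    """
    marker = os.path.join(mount_root, access_file)
    # An empty marker file is enough; the script only checks that it exists.
    with open(marker, "a"):
        pass
    os.makedirs(os.path.join(mount_root, backup_dir), exist_ok=True)
    return marker


# Demonstration against a temporary directory rather than a real disk.
demo_root = tempfile.mkdtemp()
demo_marker = prepare_backup_disk(demo_root)
```

On a real disk you would pass the mountpoint (for example /mnt/backup_usb) instead of the temporary directory.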
Configuration
Read the rdiff-backup-usb.ini file and customise it for your site. Don't delete options from the file, because the script will probably break. Add a new section for each backup set, and remember to set general.backup_count to the right number when you're done. Paths for all includes/excludes are combined with the mount path (for LVM) and the source path for each backup.
If you need to change the arguments to rdiff-backup or the email text, edit the .py file.
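To make the path handling concrete, here is a minimal sketch of how the settings end up as rdiff-backup arguments. It follows the same idea as loadConfig and addArgList in the source below, but it is written for Python 3 (`configparser` rather than `ConfigParser`), and the names `SAMPLE_INI`, `load_config` and `path_args` are hypothetical. The sample values are made up for the example.

```python
import configparser

# Made-up sample settings, trimmed down from the shipped .ini file.
SAMPLE_INI = """
[general]
lvm_mountpoint = /mnt/backup_lvm
excludes = tmp var/tmp

[1]
name = vserver1
lvm = yes
source = /
excludes = home/scratch
"""


def load_config(text):
    """Flatten an .ini document into {"section.option": value} keys."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    flat = {}
    for section in parser.sections():
        for option in parser.options(section):
            key = section.lower() + "." + option.lower()
            flat[key] = parser.get(section, option).strip()
    return flat


def path_args(config, key, flag, base_path):
    """Prefix each space-separated entry with the flag and the base path."""
    items = config.get(key, "").split()
    return [flag + " " + base_path + item for item in items]


config = load_config(SAMPLE_INI)
base = config["1.source"]
if config["1.lvm"].lower() == "yes":
    # For LVM backups the snapshot mountpoint is put in front of the source path.
    base = config["general.lvm_mountpoint"] + base

args = path_args(config, "1.excludes", "--exclude", base)
args += path_args(config, "general.excludes", "--exclude", base)
```

With these sample values every exclude ends up under /mnt/backup_lvm/, which is why the .ini comments ask for no leading '/' on include and exclude entries.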
Running
When you run the script, it will do the following things (assuming you haven't changed the config):
- Load the config file from /etc/backup/rdiff-backup-usb.ini
- Check for any other instances of the script that are running (in case one got stuck/took too long). If it finds one it will log and email.
- Mount the USB disk at /mnt/backup_usb. If this fails, it's not an error (so you can run from cron), but it will return a result code of 1
- Check that a file BACKUP_OK is present in the root directory. If this fails, it's not an error, but it will return a result code of 1
- Run each backup defined in the config file, in numeric order:
  * Create a directory for the backup name if it doesn't exist - eg. /mnt/backup_usb/backup/<name>
  * Create an LVM snapshot of the volume and mount it (if lvm=yes is set)
  * Run rdiff-backup
  * Unmount the LVM snapshot and remove it (lvm backups only)
- Check the free space remaining on the USB disk
- Unmount the USB disk
- Produce a report for each backup
- Append the report to /var/log/backup.log
- Email the full report as an attachment to the admin, with the summary in the main message body.
If it encounters errors running a single backup it will continue with the remaining ones. It also tries hard to clean up after itself (removing snapshots, unmounting things, etc). And it won't run again if it didn't exit cleanly (it uses a lock file).
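The "carry on after a failure" behaviour can be sketched like this. It is a simplified Python 3 illustration of the loop in main(), not the script's own code: `run_all` and the two demo jobs are hypothetical names, and each job stands in for one call to processBackup.

```python
import traceback


def run_all(backups):
    """Run each (name, job) pair in order, recording failures and carrying on."""
    results = []
    for name, job in backups:
        try:
            job()
            results.append({"name": name, "success": True, "error": None, "detail": ""})
        except Exception as exc:
            # Keep the traceback for the full log, then move on to the next backup.
            results.append({
                "name": name,
                "success": False,
                "error": str(exc),
                "detail": traceback.format_exc(),
            })
    return results


def failing_job():
    raise RuntimeError("snapshot could not be created")


def working_job():
    pass


demo_results = run_all([("1 - dom0_server", failing_job), ("2 - vserver1", working_job)])
overall_success = all(r["success"] for r in demo_results)
```

The overall result is a failure as soon as any one backup fails, which matches how the report's "Overall success" line is computed.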
You can specify some command-line options:
-h print out some help
-d dryrun - this will go through the paces without doing anything dangerous. It will mount the USB disk, but it won't run rdiff-backup OR create/mount/remove any LVM snapshots
-c specify an alternative config file to use
-f ignore existence of the lock file (/var/lock/rdiff-backup-usb) and run anyway
-v display verbose messages for debugging
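As an illustration of how these options are read, the sketch below uses the same getopt specification as main() in the source. It is written for Python 3, and the function name `parse_options` and the returned dictionary are hypothetical conveniences for the example (the script itself sets global variables instead).

```python
import getopt

DEFAULT_CONFIG_PATH = "/etc/backup/rdiff-backup-usb.ini"


def parse_options(argv):
    """Parse the script's command-line flags into a plain dictionary."""
    # "c:" and "config=" take a value; the other options are simple switches.
    opts, _extra = getopt.getopt(argv, "dfhc:v", ["dryrun", "force", "help", "config="])
    settings = {
        "verbose": False,
        "dryrun": False,
        "force": False,
        "help": False,
        "config": DEFAULT_CONFIG_PATH,
    }
    for flag, value in opts:
        if flag == "-v":
            settings["verbose"] = True
        elif flag in ("-h", "--help"):
            settings["help"] = True
        elif flag in ("-d", "--dryrun"):
            settings["dryrun"] = True
        elif flag in ("-f", "--force"):
            settings["force"] = True
        elif flag in ("-c", "--config"):
            settings["config"] = value
    return settings


example = parse_options(["-d", "-v", "--config", "/tmp/test.ini"])
```

An unknown option makes getopt raise getopt.GetoptError, which is where the script prints its usage text and exits with code 2.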
UDev
If you want your backups to automatically run when you connect a disk we can trigger the script from udev. This is probably the most Ubuntu-specific bit, but there is great documentation on udev rules at http://www.reactivated.net/writing_udev_rules.html if you get stuck.
Create a file called 70-rdiff-backup-usb.rules in /etc/udev/rules.d. The contents are a single line:
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_LABEL}=="BACKUP_*", SYMLINK+="rdiff_backup_usbdisk", RUN+="/usr/local/bin/rdiff-backup-usb.py &"
When a disk is connected with a filesystem label starting with "BACKUP_", we create a symlink to it at /dev/rdiff_backup_usbdisk, and run the backup script. Our fstab entry will then look like this (for an ext2 disk):
/dev/rdiff_backup_usbdisk /mnt/backup_usb ext2 noauto 0 0
It doesn't matter whether the usb disk shows up as sdc, sdd, sde, or anything - it will always have a symlink at /dev/rdiff_backup_usbdisk. And it handles a series of disks (BACKUP_01, BACKUP_02, etc) in the same way. You can set filesystem labels using e2label (ext2/ext3) or xfs_admin (xfs).
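If you are unsure which labels the rule will pick up, the glob match can be approximated in Python. This is only a sketch: udev does its own shell-style pattern matching, and `fnmatchcase` from the standard library is used here as a stand-in for the simple `*` case. The name `triggers_backup` is hypothetical.

```python
from fnmatch import fnmatchcase

# Same pattern as the ENV{ID_FS_LABEL} match in the udev rule.
LABEL_PATTERN = "BACKUP_*"


def triggers_backup(label):
    """Return True if a filesystem label would match the rule's pattern."""
    return fnmatchcase(label, LABEL_PATTERN)


matches = [label for label in ("BACKUP_01", "BACKUP_02", "backup_03", "DATA") if triggers_backup(label)]
```

The match is case-sensitive, so a disk labelled backup_03 would not start a backup.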
The only problem with this trick is restoring or otherwise mounting the disk (when you don't want a backup) on the same system. Maybe someone could suggest how to filter it by USB port, or start the backup after 5 minutes unless XYZ happens, or ...
If you don't want the auto-start you can also use this udev script to consistently get device names for putting into /etc/fstab (remove the RUN clause), or check out /dev/disk/by-* for similar ways to do it.
Important
Don't blame me if your disks get deleted or the backups don't work; the code is all there for you to review and change yourself! You can reach me with questions or problems at robert.coup@onetrackmind.co.nz, or just change the source here.
The Source
rdiff-backup-usb.py
#!/usr/bin/env python
# rdiff-backup-usb.py
# Robert Coup, One Track Mind
# July 2006
"""
A script for doing backups using rdiff-backup. It is designed to be run
from a udev usb-disk-arrived event.
"""
DEFAULT_CONFIG_PATH = "/etc/backup/rdiff-backup-usb.ini"
DEFAULT_RDIFFBACKUP_ARGS = [
    "--exclude-device-files",
    "--exclude-sockets",
    "--print-statistics",
    "--preserve-numerical-ids"
]
EMAIL_BODY = """
The backup started at %s is now complete.
You may now safely turn off then disconnect the external hard
disk. To start the next backup, just re-attach the external
hard disk.
Here is the summary report for the backup run - see the attached log
for more details.
-----------------------
%s
-----------------------
USB Disk Space:
%s
"""
#imports
import os
import os.path
import time
import sys
import ConfigParser
import getopt
import traceback
import email
import email.MIMEText
import email.MIMEMultipart
import smtplib
import atexit
# globals
config = {}
actionLog = []
curLog = None
verbose = False
dryrun = False
class LvmError(Exception):
    pass
class MountError(Exception):
    pass
class RdiffBackupError(Exception):
    pass
class LogSection:
    def __init__(self, sectionName):
        self.name = sectionName
        self.startTime = time.time()
        self.log = []
        self.success = True
        self.endTime = 0
        self.error = None
class BackupReport:
    def __init__(self):
        self.success = False
        self.log = ""
        self.email = ""
def debug(msg):
    if (verbose):
        log(msg)
def info(msg):
    log(msg)
    if (not verbose):
        print msg
def log(msg):
    global curLog
    curLog.log.append(str(msg))
    if (verbose):
        print msg
def logCmd(cmd, doOnDry=False):
    if (dryrun and not doOnDry):
        log("SKIPPING running command (dryrun): " + cmd)
        return None
    (exitStatus, output) = runCmd(cmd)
    log(output)
    return exitStatus
def infoCmd(cmd, doOnDry=False):
    if (dryrun and not doOnDry):
        info("SKIPPING running command (dryrun): " + cmd)
        return None
    (exitStatus, output) = runCmd(cmd)
    info(output)
    return exitStatus
def runCmd(cmd):
    # popen's close() returns None when the command exits with status 0
    app = os.popen(cmd + " 2>&1", 'r')
    output = app.read()
    return (app.close(), output)
def logSetSection(newSectionName):
    global actionLog, curLog
    if (curLog != None):
        curLog.endTime = time.time()
    nl = LogSection(newSectionName)
    actionLog.append(nl)
    curLog = nl
def logReport(report):
    lf = open(config["general.log_file"], "a")
    lf.write("%s - New Backup Run\nSummary:\n" % time.strftime('%c %Z', time.localtime(time.time())) )
    lf.write(report.summary)
    lf.write("\n Full log:\n")
    lf.write(report.log)
    lf.write("\n")
    lf.close()
def emailReport(report):
    msg = email.MIMEMultipart.MIMEMultipart()
    msg["To"] = config["email.to"]
    msg["From"] = config["email.from"]
    # build the subject first and assign it once: item assignment on a
    # Message appends a new header rather than replacing an existing one
    subject = config["email.subject"]
    if (report.success):
        subject += " - SUCCESS"
    else:
        subject += " - FAILURE"
    if (dryrun):
        subject += " *** DRY RUN ONLY ***"
    msg["Subject"] = subject
    msg.preamble = "Mime message\n"
    msg.epilogue = ""
    msgBody = email.MIMEText.MIMEText(report.email)
    msg.attach(msgBody)
    logAttachment = email.MIMEText.MIMEText(report.log)
    logAttachment.add_header("Content-disposition", "attachment", filename="backup_log.txt")
    msg.attach(logAttachment)
    smtp = smtplib.SMTP(config["email.smtp_server"])
    smtp.sendmail(config["email.from"], config["email.to"], msg.as_string())
    smtp.quit()
def createReport(diskFreeReport):
    global curLog
    curLog.endTime = time.time()
    r = BackupReport()
    full = []
    summary = []
    allStart = actionLog[0].startTime
    allEnd = curLog.endTime
    if (dryrun):
        summary.append("*** DRY RUN ONLY ***")
    summary.append("Started at: %s" % time.strftime('%c %Z', time.localtime(allStart)))
    summary.append("Ended at: %s" % time.strftime('%c %Z', time.localtime(allEnd)))
    summary.append("Total time: %d minutes" % ((allEnd - allStart) / 60))
    r.success = True
    for sec in actionLog:
        r.success = r.success and sec.success
        full.append("\n----------- %s ------------" % sec.name)
        full.append(" Start: %s" % time.strftime('%c %Z', time.localtime(sec.startTime)))
        full.append(" End: %s" % time.strftime('%c %Z', time.localtime(sec.endTime)))
        full.append(" Successful: %s" % sec.success)
        if (sec.error != None):
            full.append(" Error: %s" % sec.error)
        full.append(" -----------------------")
        full.extend(sec.log)
        summary.append(" %s: %s" % (sec.name, sec.success))
    summary.append("Overall success: %s" % r.success)
    r.log = "\n".join(full)
    r.summary = "\n".join(summary)
    r.email = EMAIL_BODY % (time.strftime('%c %Z', time.localtime(allStart)), r.summary, diskFreeReport)
    return r
def addArgList(configKey, baseArgument):
    al = []
    if (config.has_key(configKey) and len(config[configKey]) > 0):
        for item in config[configKey].split(" "):
            al.append(baseArgument + item)
    return al
def processBackup(index):
    idx = str(index) + "."
    name = config[idx+"name"]
    logSetSection("%d - %s" % (index, name))
    info("Beginning backup %d: %s ..." % (index, name))
    basePath = config[idx+"source"]
    destDir = config["general.disk_mountpoint"] + config["general.backup_path"] + name
    lvm = (config[idx+"lvm"].lower() == "yes")
    if (lvm):
        basePath = config["general.lvm_mountpoint"] + config[idx+"source"]
    # build up the command line
    cmdArgs = DEFAULT_RDIFFBACKUP_ARGS[:]
    cmdArgs.extend(addArgList(idx+"includes", "--include " + basePath))
    cmdArgs.extend(addArgList(idx+"excludes", "--exclude " + basePath))
    cmdArgs.extend(addArgList("general.includes", "--include " + basePath))
    cmdArgs.extend(addArgList("general.excludes", "--exclude " + basePath))
    cmdArgs.append(basePath)
    cmdArgs.append(destDir)
    # create destination directory (if it doesn't exist)
    if (not os.path.isdir(destDir)):
        os.mkdir(destDir)
    if (lvm):
        # create a snapshot
        cmdLine = "/sbin/lvcreate -L%s -s -n %s /dev/%s/%s" % (config["general.lvm_snapshotsize"], config["general.lvm_snapshotname"], config[idx+"lvm_vg"], config[idx+"lvm_lv"])
        log("Creating LVM snapshot: %s" % cmdLine)
        time.sleep(3)
        res = logCmd(cmdLine)
        if (res != None):
            raise LvmError("Creating snapshot: %d" % res)
    try:
        if (lvm):
            # mount the snapshot
            if (config.has_key(idx+"mountoptions")):
                mountOpts = "," + config[idx+"mountoptions"]
            else:
                mountOpts = ""
            cmdLine = "/bin/mount -s /dev/%s/%s %s -oro%s" % (config[idx+"lvm_vg"], config["general.lvm_snapshotname"], config["general.lvm_mountpoint"], mountOpts)
            log("Mounting LVM snapshot: %s" % cmdLine)
            res = logCmd(cmdLine)
            if (res != None):
                raise MountError("Mounting snapshot: %d" % res)
        try:
            # run rdiff-backup
            cmdLine = "%s %s" % (config["general.rdiffbackup_bin"], " ".join(cmdArgs))
            log("Running rdiff-backup: " + cmdLine)
            res = infoCmd(cmdLine)
            if (res != None):
                raise RdiffBackupError("Result code = %d" % res)
        finally:
            if (lvm):
                # unmount the snapshot
                cmdLine = "/bin/umount %s" % config["general.lvm_mountpoint"]
                log("Un-mounting LVM snapshot: %s" % cmdLine)
                res = logCmd(cmdLine)
                if (res != None):
                    raise MountError("Un-mounting snapshot: %d" % res)
    finally:
        if (lvm):
            # delete the snapshot
            cmdLine = "/sbin/lvremove -f /dev/%s/%s" % (config[idx+"lvm_vg"], config["general.lvm_snapshotname"])
            log("Deleting LVM snapshot: %s" % cmdLine)
            time.sleep(3)
            res = logCmd(cmdLine)
            if (res != None):
                raise LvmError("Deleting snapshot: %d" % res)
    info("Backup completed successfully.")
def checkMagicFile():
    """ Checks that the disk has the magic file in the root """
    return os.path.exists(config["general.disk_mountpoint"] + config["general.access_file"])
def mountDisk():
    cmdLine = "/bin/mount %s" % config["general.disk_mountpoint"]
    log("Mounting: %s: %s" % (config["general.disk_mountpoint"], cmdLine))
    res = logCmd(cmdLine, True)
    if (res != None):
        log("Mount error: %s" % res)
    return (res == None)
def unmountDisk():
    cmdLine = "/bin/umount %s" % config["general.disk_mountpoint"]
    log("Un-mounting %s: %s" % (config["general.disk_mountpoint"], cmdLine))
    res = logCmd(cmdLine, True)
    if (res != None):
        log("Un-mount error: %s" % res)
    return (res == None)
def loadConfig(file, config={}):
    config = config.copy()
    cp = ConfigParser.ConfigParser()
    cp.read(file)
    for sec in cp.sections():
        name = sec.lower()
        for opt in cp.options(sec):
            config[name + "." + opt.lower()] = cp.get(sec, opt).strip()
    return config
def setAppLock(ignore):
    lock = config["general.lock_file"]
    if (os.path.exists(lock)):
        info("Lock file %s exists!" % lock)
        if (not ignore):
            return False
        else:
            info("ignoring existing lockfile...")
    else:
        file(lock, "w").close()
    atexit.register(lambda:os.remove(lock))
    return True
def usage():
    print """
rdiff-backup-usb
usage: rdiff-backup-usb.py [-v] [--dryrun] [--config config_file]
  -v           verbose
  --config,-c  specify an alternative config file (default is %s)
  --dryrun,-d  do a dry run without running lvm or rdiff-backup stuff
  --force,-f   ignore an existing lockfile and run anyway
""" % DEFAULT_CONFIG_PATH
def main():
    global config, verbose, dryrun, startTime
    logSetSection("begin")
    try:
        opts, args = getopt.getopt(sys.argv[1:], "dfhc:v", ["dryrun", "force", "help", "config="])
    except getopt.GetoptError:
        usage()
        sys.exit(2)
    configFile = DEFAULT_CONFIG_PATH
    ignoreLock = False
    for o, a in opts:
        if o == "-v":
            verbose = True
        if o in ("-h", "--help"):
            usage()
            sys.exit(0)
        if o in ("-d", "--dryrun"):
            dryrun = True
        if o in ("-f", "--force"):
            ignoreLock = True
        if o in ("-c", "--config"):
            configFile = a
    try:
        if (not os.path.isfile(configFile)):
            raise Exception
        config = loadConfig(configFile)
    except:
        print "Can't open config file: %s" % configFile
        sys.exit(3)
    debug(config)
    if (dryrun):
        info("*** This is a DRY RUN ONLY ***")
    if (setAppLock(ignoreLock)):
        if (not mountDisk()):
            info("Error during mount - quitting...")
            sys.exit(1)
        if (not checkMagicFile()):
            info("Disk doesn't have the correct magic file - quitting...")
            unmountDisk()
            sys.exit(1)
        info("Starting backups...")
        for ii in range(1, int(config["general.backup_count"])+1):
            try:
                processBackup(ii)
            except Exception,e:
                info("Got error: " + traceback.format_exc())
                curLog.success = False
                curLog.error = str(e)
        logSetSection("end")
        # get free disk space report for USB disk
        (dfExitStatus, dfReport) = runCmd("/bin/df -h %s" % config["general.disk_mountpoint"])
        info(dfReport)
        if (dfExitStatus != None):
            curLog.success = False
        # unmount the USB disk
        if (not unmountDisk()):
            info("Error unmounting USB disk")
            curLog.success = False
    else:
        # there is an instance already running - send & save a report of this
        curLog.success = False
        curLog.error = "Existing backup instance already running"
        dfReport = ""
    # create the backup report
    results = createReport(dfReport)
    try:
        # log report to file
        logReport(results)
    except:
        print("Error logging results: " + traceback.format_exc())
    if (config["email.enable"].lower() == "yes"):
        try:
            # email report
            emailReport(results)
        except:
            print("Error emailing results: " + traceback.format_exc())
    print "\n\nBACKUP SUMMARY:\n" + results.summary
if __name__ == '__main__':
    main()
rdiff-backup-usb.ini
[general]
# path to the mount point of the USB disk. fstab should be configured
# so that 'mount <disk_mountpoint>' just works. (trailing '/')
disk_mountpoint = /mnt/backup_usb/
# path to a mountpoint for the lvm snapshot (no trailing '/')
lvm_mountpoint = /mnt/backup_lvm
# what the logical volume name is for the backup snapshots
# BE CAREFUL THIS DOESN'T MATCH AN EXISTING VOLUME OR IT WILL BE DELETED
lvm_snapshotname = rdiff-backup-usb
# how big to make our LVM snapshots (we only do one at a time)
lvm_snapshotsize = 2G
# path to backups within the disk (trailing '/')
backup_path = backup/
# file that should be located in the root of the backup disk for us to
# start using it
access_file = BACKUP_OK
# log file
log_file = /var/log/backup.log
# lock file
lock_file = /var/lock/rdiff-backup-usb
# path to rdiff-backup
rdiffbackup_bin = /usr/bin/rdiff-backup
# global includes and excludes (no leading or trailing '/')
# these are included _before_ the includes/excludes from a backup below
includes =
excludes = tmp var/tmp var/log var/cache proc mnt media sys
# how many backups are we doing?
# there needs to be one section for each backup location
# the section title should be named in numeric order starting from 1
backup_count = 2

[email]
# enable sending of reports by email
enable = yes
# from and to addresses
from = rdiff-backup-usb@example.com
to = backupmaster@example.com
# base subject line for the email
subject = Backup: MYSERVER
smtp_server = localhost

[1]
# the directory name this will be stored under on the disk
name = dom0_server
# is this an lvm volume
lvm = no
# path to the backup source (leading and trailing '/')
source = /
# directories to include relative to source (no leading or trailing '/')
#includes =
# directories to exclude relative to source (no leading or trailing '/')
#excludes =

[2]
# the directory name this will be stored under on the disk
name = vserver1
# is this an lvm volume
lvm = yes
# the lvm volume group its on
lvm_vg = vg-myserver
# the lvm logical volume its on
lvm_lv = vserver1
# Any mount options - specify them here in the same form as for mount (ie. no spaces)
# NOTE: nouuid is required for lvm snapshots on xfs filesystems
mountoptions = nouuid
# path to the backup source within the lvm volume (leading and trailing '/')
source = /
# directories to include relative to source (no leading or trailing '/')
#includes =
# directories to exclude relative to source (no leading or trailing '/')
#excludes =
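Because the script looks up sections 1 to backup_count by number, a mismatch between backup_count and the sections actually present is an easy mistake to make. The following Python 3 sketch shows one way to check for it before running a backup. It is not part of the script: `check_backup_sections` and the sample text are hypothetical.

```python
import configparser

# Made-up sample: backup_count says 2, but only section [1] is defined.
SAMPLE_INI = """
[general]
backup_count = 2

[1]
name = dom0_server
lvm = no
source = /
"""


def check_backup_sections(text):
    """Return (backup_count, list of numbered sections that are missing)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    count = parser.getint("general", "backup_count")
    missing = [str(n) for n in range(1, count + 1) if not parser.has_section(str(n))]
    return count, missing


count, missing = check_backup_sections(SAMPLE_INI)
```

An empty `missing` list means every section from 1 to backup_count is present.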
