UpdateErrorRetry
From RdiffBackupWiki
My scenario:
I back up the mailboxes of a mail server using rdiff-backup. A few of the many mailboxes change during the backup, resulting in "UpdateError mailbox_xy Updated mirror temp file /var/backup/rdiff-backup.tmp.310 does not match source". Because of this error, the changes are not saved, and the backups of these mailboxes remain those of the last successful run. I would like to have an option such as "--retry-changed-files" that causes rdiff-backup to process again the files that could not be saved because the source file changed. These files should be re-processed after the normal backup run (which usually takes some time), so the chances of a successful backup of the mailboxes would be much greater. Maybe one could also specify how many times rdiff-backup should try to process a file and/or whether it should wait (sleep) a certain amount of time before it starts to re-process the files.
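To make the request concrete, here is a minimal sketch of the deferred-retry idea. It is not rdiff-backup code: `backup_one`, `max_retries` and `wait_seconds` are hypothetical names, and the `UpdateError` class is only a stand-in for the error quoted above.

```python
import time


class UpdateError(Exception):
    """Stand-in for the error reported when a source file changes mid-copy."""


def backup_with_retries(paths, backup_one, max_retries=3, wait_seconds=0.0):
    """Back up every path once, then re-process the ones that failed.

    backup_one is a hypothetical callable that copies a single file and
    raises UpdateError if the source changed while it was being read.
    Returns the list of paths that still failed after all retry passes.
    """
    # Normal backup run: remember failures instead of retrying right away.
    failed = []
    for path in paths:
        try:
            backup_one(path)
        except UpdateError:
            failed.append(path)

    # Retry passes start only after the full run has finished.
    for _ in range(max_retries):
        if not failed:
            break
        time.sleep(wait_seconds)  # optional pause before each retry pass
        still_failing = []
        for path in failed:
            try:
                backup_one(path)
            except UpdateError:
                still_failing.append(path)
        failed = still_failing
    return failed
```

Because the retries run after the whole tree has been processed, a mailbox that was busy early in the run gets some time to settle before it is tried again.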
So, what do you think of this? Could this be useful to someone other than me?
-- Marco Steinacher <marco@websource.ch>
It certainly seems like a useful feature. However, implementing it looks like it could be tricky.
The patch and patch_and_increment functions in backup.py drive the two processes that can lead to these UpdateErrors. Both functions run as 'for diff in ...' loops. The diff iterator is based on the sig iter, which calls the CCPP iterator, which in turn is based on the iterators that actually go over the files on the source and destination sides. Those iterators come from selection.py.
It would seem that simply appending the troublesome file to the source/dest iterators so they would be handled again would do the trick. However, the CCPP iterator caches files that have been seen "recently", and it could get very confused by a file that's in the cache twice (see shorten_cache function). Care would have to be taken to keep the cache in a clean state.
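As a toy illustration of the "append the troublesome file to the iterator" idea, here is a sketch that is independent of rdiff-backup's real code. It uses plain strings instead of rdiff-backup's path objects and does not model the CCPP cache at all; `with_retry_queue`, `process_all` and `handle` are made-up names. The per-path attempt counter hints at the kind of bookkeeping needed so that a file seen twice does not cause trouble.

```python
from collections import deque


def with_retry_queue(source_iter, retry_queue):
    """Yield everything from source_iter, then anything queued for retry.

    retry_queue is a deque that the consumer may append to while the loop
    is running; entries added late are still picked up at the end.
    """
    for item in source_iter:
        yield item
    while retry_queue:
        yield retry_queue.popleft()


def process_all(paths, handle, max_attempts=2):
    """Loop over paths; handle(path) returns False to ask for a retry.

    Attempts are counted per path so that a file which never settles
    cannot keep re-entering the queue forever.
    """
    retry_queue = deque()
    attempts = {}
    gave_up = []
    for path in with_retry_queue(iter(paths), retry_queue):
        attempts[path] = attempts.get(path, 0) + 1
        if handle(path):
            continue
        if attempts[path] < max_attempts:
            retry_queue.append(path)  # revisited after the normal run
        else:
            gave_up.append(path)
    return attempts, gave_up
```

In the real code the retried file would pass through the cache a second time, which is exactly where the care mentioned above would be needed.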
--AndrewFerguson 11:52, 16 June 2007 (EST)
If the source is using LVM (logical volume manager), then an LVM snapshot is a great way to handle this. Just before the backup, freeze the database server (if any), snapshot to a new device with lvcreate, and then let everything continue on the original while you back up the snapshot. When finished, lvremove the snapshot. I'm able to do this on one system and it works beautifully. On another system that doesn't (and can't) use LVM, I get plenty of UpdateErrors from log files that are updated during the backup. --Chris 23:32, 23 July 2007 (EST)
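The snapshot workflow described above could be scripted roughly as follows. This is only a sketch under assumptions: the volume group, volume names, snapshot size and mount point are placeholders, freezing the database server is not shown, and some filesystems may need extra mount options for a snapshot. The `run` parameter exists so the commands can be inspected without executing them.

```python
import subprocess
from contextlib import contextmanager


def snapshot_commands(volume_group, origin, snapshot, size, mount_point):
    """Build the four commands used to create, mount, unmount and remove a snapshot."""
    origin_device = f"/dev/{volume_group}/{origin}"
    snapshot_device = f"/dev/{volume_group}/{snapshot}"
    return {
        "create": ["lvcreate", "--snapshot", "--size", size,
                   "--name", snapshot, origin_device],
        "mount": ["mount", "-o", "ro", snapshot_device, mount_point],
        "unmount": ["umount", mount_point],
        "remove": ["lvremove", "--force", snapshot_device],
    }


@contextmanager
def lvm_snapshot(volume_group, origin, snapshot, size, mount_point,
                 run=subprocess.check_call):
    """Create and mount a read-only snapshot; always clean up on exit."""
    commands = snapshot_commands(volume_group, origin, snapshot, size, mount_point)
    run(commands["create"])
    try:
        run(commands["mount"])
        try:
            yield mount_point
        finally:
            run(commands["unmount"])
    finally:
        # The snapshot is removed even if mounting or the backup failed.
        run(commands["remove"])


# Example use (needs root and an existing volume group; names are placeholders):
# with lvm_snapshot("vg0", "mail", "mail_snap", "1G", "/mnt/mail_snap") as source:
#     subprocess.check_call(["rdiff-backup", source, "/var/backup/mail"])
```

Since the backup reads from the frozen snapshot rather than the live volume, files cannot change underneath rdiff-backup during the run.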
Any update on this?
So rdiff-backup cannot be used to copy frequently changing files, e.g. log files on busy servers? No mailbox backups? Wtf?!?!
IMHO, this is unexpected and totally undesirable behavior for a program that has "backup" in its name.
Another approach for the retry would be to sleep a (configurable) number of seconds and retry the file a (configurable) number of times. I didn't look at the sources, but something along these lines:
    import time

    # sync_file, UpdateError, opt_maxretries and opt_sleep are placeholders,
    # not actual rdiff-backup names.
    retries = 0
    errors = 0
    while retries < opt_maxretries:
        retries += 1
        try:
            sync_file(filename)
        except UpdateError:
            errors += 1
            print('UpdateError -- will retry', filename, 'in', opt_sleep, 'seconds')
            time.sleep(opt_sleep)
        else:
            errors = 0
            break  # success: stop retrying
    if errors > 0:
        print('Update error:', filename, 'changed during update')
Sorry, but using LVM to handle this issue is not always possible, and shouldn't be required for proper use of rdiff-backup, IMHO. There are many sysadmins with some big systems out there (I'm one of them), and doing a full reformat is out of the question.
-- FV Mon, 24 May 2010 11:40:13 -0300
