SparseFiles
From RdiffBackupWiki
(text deleted)..e files nicely would be lovely, but there's some evidence it is not currently possible to do it in a portable fashion.
-- Andrew Bressen <bressen@savannah.nongnu.org>
Rather than worrying about copying a file's sparseness you could just make all output files sparse by default - so when writing, write in blocks and deal with a block of zeroes as a nop followed by a seek to the next set of non-zero data.
-- Nigel Metheringham <nigel.metheringham@pobox.com>
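A minimal sketch of the approach Nigel describes, assuming a filesystem that leaves a hole when a writer seeks past unwritten space. The function name `copy_sparse` and the block size are illustrative only and are not taken from rdiff-backup; the zero test here is chosen for clarity, not speed.

```python
import os


def copy_sparse(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    Hypothetical helper for illustration; not part of rdiff-backup.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if any(block):
                # At least one non-zero byte: write the block normally.
                dst.write(block)
            else:
                # Block of zeroes: skip ahead without writing. On
                # filesystems that support it, this leaves a hole.
                dst.seek(len(block), os.SEEK_CUR)
        # Seeking alone does not extend the file, so if the source ended
        # in zeroes the length has to be set explicitly.
        dst.truncate(dst.tell())
```

The final `truncate` matters: without it, a file that ends in a run of zeroes would come out shorter than the original. Reading the copy back returns zeroes for the skipped ranges, so the contents stay byte-for-byte identical.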
Implementing sparse files that way may save some disk space, but I think that for it to be worthwhile, sparse files need to be detected on the source side. For instance, because of a bug on my machine, one of the log files grew to over 1 TB, almost all of it a "hole". So even if rdiff-backup wrote a sparse file on the destination, it would still take absurdly long to process the file and to transfer it.
Also, even if rdiff-backup could detect sparse files, it would take a modification of the rsync algorithm to patch/diff sparse files efficiently. As it is now, they would just process buffer after buffer full of 0's.
Anyway, Nigel's suggestion is a good one and relatively easy to implement.
I guess it depends on how/why sparse files are used, and what the consequences of not supporting them are.
-- BenEscoto
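One common way to guess on the source side whether a file is sparse is to compare its apparent size with the space actually allocated to it. The sketch below assumes a Unix-like system where `os.stat` reports `st_blocks` in 512-byte units (as the Python documentation describes for Unix; the field is not available on Windows). The function names are hypothetical, and this is not how rdiff-backup currently works. Filesystems that compress data can also report fewer allocated bytes than the apparent size, so treat the result as a hint only.

```python
import os


def looks_sparse(size_bytes, blocks_512):
    """Return True if fewer bytes are allocated than the apparent size.

    blocks_512 is a count of 512-byte blocks, as reported by st_blocks.
    """
    return blocks_512 * 512 < size_bytes


def file_looks_sparse(path):
    """Apply the heuristic to a file on disk (Unix-like systems only)."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)
```

This only answers whether a file probably contains holes; it does not say where they are, so by itself it would not avoid reading through the zeroes.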
Virtualization products like VMware and Xen create virtualized disks as sparse files. It's not uncommon to create a virtual machine with a virtual disk that can grow to tens of gigabytes but whose sparse image is well under 1 gigabyte. Until rdiff-backup provides support for sparse files, we'll have to find another means to back them up.
-- Randy
Sparse File Patch for rdiff-backup CVS
Update: Mon Feb 14 19:07:12 PST 2011
This sparse file patch does work; however, the code that finds blocks of 0's and seeks the write cursor to create the sparse block is rather slow. I have been having trouble on my 600GB volume when attempting to do sparse backups. If someone has a more efficient (faster) method of computing sparse blocks, please contact me and I will update the patch. Thanks! -Eric
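One possible way to speed up the zero test, offered as a sketch and not as a description of what the patch does: compare each block against a preallocated buffer of zeroes. In CPython, equality between two `bytes` objects is evaluated in C, which is likely to be much faster than examining bytes one at a time in Python code. The name `make_zero_tester` is made up for this example.

```python
def make_zero_tester(block_size):
    """Return a function that reports whether a block is all zero bytes.

    Hypothetical helper: a zero buffer of the usual block size is
    allocated once and reused for every full-sized block.
    """
    zeros = bytes(block_size)

    def is_zero(block):
        n = len(block)
        if n == block_size:
            # Common case: compare against the shared zero buffer.
            return block == zeros
        # Short final block (or any other size): build a matching buffer.
        return block == bytes(n)

    return is_zero
```

Usage would look like `is_zero = make_zero_tester(64 * 1024)` once, then `is_zero(block)` for each block read. Whether this removes the bottleneck on a 600GB volume would need to be measured.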
This is a simple patch to support sparse files on all filesystems that can arbitrarily seek and write to generate sparse files. It will patch against rdiff-backup CVS at 2011/01/02, and should work for older versions too (patch: [1], announcement: [2])
This works well for sparse backup of LVM volumes (see [BlockFuse to the Rescue: rdiff-backup of LVM Snapshots and Block Devices [3]]) that have lots of holes. I have validated that this patch works successfully for my LVM snapshots. These are examples of sparse-backed-up 10GB and 50GB LVM images, and I have images as large as 600GB:
# This volume is only 20% full:
root@ubuntu:/mnt/backup/lvm-snapshots# ls -slh `pwd`/vgBoot-_snap--arm--vm1
2.3G -r-------- 1 root root 10G 2010-12-23 15:08 /mnt/backup/lvm-snapshots/vgBoot-_snap--arm--vm1

# This volume is nearly full, so not much space was saved
root@ubuntu:/mnt/backup/lvm-snapshots# ls -lhs `pwd`/vgBoot-_snap--SSi
46G -r-------- 1 root root 50G 2010-12-23 15:08 /mnt/backup/lvm-snapshots/vgBoot-_snap--SSi

root@ubuntu:/mnt/backup/lvm-snapshots# md5sum vgBoot-_snap--SSi /dev/mapper/vgBoot-_snap--SSi
0a726c35549d83e80dd08d84d539e9da vgBoot-_snap--SSi
0a726c35549d83e80dd08d84d539e9da /dev/mapper/vgBoot-_snap--SSi

root@ubuntu:/mnt/backup/lvm-snapshots# md5sum vgBoot-_snap--arm--vm1 /dev/vgBoot/arm-vm1
6c3ea4d4070dea41a5346da57e514165 vgBoot-_snap--arm--vm1
6c3ea4d4070dea41a5346da57e514165 /dev/vgBoot/arm-vm1
A few notes:
- "Sparse file support" could be auto-detected by rdiff-backup, or have an option to enable it. (Personally, I like the auto-detect method, though I am not familiar enough with rdiff-backup to provide a patch in the right place.)
- Works for local and remote-to-local sync. Local-to-remote backups require the far-end to have this patch too.
Feel free to contact me with questions or comments.
Cheers,
