BackupLevels
From RdiffBackupWiki
Advantages of backup levels are better control over
* backup size
* restore speed
* coverage and granularity of backups.
ATM, my backup volume fills up a little more with every increment that is created. With increments, the size of the backup is
backup_size = size-of-data + ( #-of-incremental-backups *
              average-size-of-an-incremental-backup )
and so it grows with each of my daily backups.
When I need or want to save space, currently the only option that rdiff-backup gives me is to delete old increments (so I cannot get old files back anymore).
Backup levels would give me the ability to
* create "increments" that are based on the state from an older
"increment" ("increments" may now be called differentials)
* overwrite unneeded old "increments" with newer "increments".
This would allow for keeping the number of increments constant and for minimizing the duplicate information in the increments, while keeping some redundancy.
The backup size becomes
backup_size = ( #-of-level0-backups * size-of-data ) +
              ( #-of-"incremental"-(level[1-x])-backups *
                average-size-of-an-"incremental"-(level[1-x])-backup )
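To make the difference between the two size formulas above concrete, here is a minimal sketch in Python. The function names and the sample figures (100 GB of data, 2 GB per increment) are invented for illustration and are not measurements. Using the same average size for both kinds of backup is a simplification, since a differential against an older state may well be larger than a day-to-day increment.

```python
def size_plain_increments(data_size, n_increments, avg_increment_size):
    """First formula: the data plus an ever-growing chain of increments."""
    return data_size + n_increments * avg_increment_size


def size_with_levels(n_level0, data_size, n_level_backups, avg_level_backup_size):
    """Second formula: a fixed number of level0 and level 1..x backups."""
    return n_level0 * data_size + n_level_backups * avg_level_backup_size


# Hypothetical figures in GB, for illustration only.
DATA = 100
AVG = 2

# With daily increments the total depends on how often the backup has run.
after_30_days = size_plain_increments(DATA, 30, AVG)
after_365_days = size_plain_increments(DATA, 365, AVG)

# One level0 backup plus 8 level backups: the count of stored backups is
# fixed, so the total no longer depends on the number of runs.
levels_total = size_with_levels(1, DATA, 8, AVG)

print(after_30_days, after_365_days, levels_total)
```

With these made-up numbers, the first formula keeps growing with the run count, while the second only changes when the data or the average change size does.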
One backup scheme may look like this:
level8 *
level7 *
level6 *
level5 *
level4 *
level3 *
level2 *
level1 *
level0 *
01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21
Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
This means I keep one full (level0) backup and only 8 "increments" (differentials). A restore would have to look at up to 5 "increments" to collect all data (all diffs). I can restore all files up to one week backwards. By using more levels/"increments", I can extend that period or introduce snapshot "increments" (e.g. at the last four Sundays, then at the beginning of the last four months, etc.).
The real advantage is that the backup size only grows with the size of the data and the size of the changes of the data, not the number of invocations of the backup program. Also, by buying more backup space I can increase the coverage of the backup (longer time backwards, by using snapshots and/or more "increments") and/or its granularity (more levels for the "incremental" scheme).
I'm not sure the poster understands how rdiff-backup works. Rdiff-backup only stores incrementals/differentials/diffs between each backup. To restore a file from 3 backup sessions ago, it applies the diff from 1 session, 2 sessions, and then 3 sessions ago. This seems a lot like the scheme described, no? Rdiff-backup has always scaled with the size of the data and the size of the changes to the data.
I think the first poster does understand the workings of rdiff-backup, and the concept of granularity is an interesting one (and also one I am missing).
I have the following backup-schedule in mind:
Day  Backuptype  Number
1    FS          Year/week (e.g. 2007/26)
2    DI          1
3    DI          2
4    DI          3
5    DI          4
6    DI          5
7    DI          6
8    WI          I
9    DI          1
10   DI          2
11   DI          3
12   DI          4
13   DI          5
14   DI          6
15   WI          II
16   DI          1
17   DI          2
18   DI          3
19   DI          4
20   DI          5
21   DI          6
22   WI          III
23   DI          1
24   DI          2
25   DI          3
26   DI          4
27   DI          5
28   DI          6
where
FS = Full Save
WI = Weekly Incremental
DI = Daily Incremental
and (very important!)
all incrementals are increments with respect to the previous incremental of the same type.
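The schedule above follows a regular pattern, which can be sketched as a small Python function. This is only an illustration of the table: the name `backup_for_day` and the tuple it returns are hypothetical, and the sketch does not decide which earlier backup each incremental is taken against.

```python
ROMAN = ["I", "II", "III"]  # numbering used for the weekly incrementals


def backup_for_day(day):
    """Return (backup type, number) for day 1..28 of the four-week cycle."""
    if not 1 <= day <= 28:
        raise ValueError("day must be in 1..28")
    if day == 1:
        # The table labels the full save by year/week (e.g. 2007/26),
        # so there is no running number to return here.
        return ("FS", None)
    week, offset = divmod(day - 1, 7)
    if offset == 0:
        return ("WI", ROMAN[week - 1])  # days 8, 15 and 22
    return ("DI", offset)               # daily incrementals 1..6


for day in (1, 2, 7, 8, 15, 22, 28):
    print(day, backup_for_day(day))
```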
An alternative scheme that is more in tune with how rdiff-backup works would be to allow intermediate incrementals to be deleted. Then you could:
- Run a daily backup
- Delete every incremental that is older than 7 days except those run on a Friday (for example)
- Delete every incremental that is older than 90 days except those run on the first Friday of the month
This wouldn't be that hard, as it would just involve rolling up the diffs from the deleted incremental into the previous incremental.
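The retention rule in the list above could be expressed roughly as follows. This is a sketch of the selection logic only, assuming one incremental per calendar day. The function `keep_increment` is a hypothetical helper, not part of rdiff-backup, and it does not delete or roll up anything.

```python
from datetime import date

FRIDAY = 4  # date.weekday() counts Monday as 0, so Friday is 4


def keep_increment(taken, today):
    """Return True if the incremental taken on `taken` should be kept."""
    age = (today - taken).days
    if age <= 7:
        return True  # not older than 7 days: keep every daily incremental
    is_friday = taken.weekday() == FRIDAY
    if age <= 90:
        return is_friday  # older than 7 days: keep Fridays only
    # Older than 90 days: keep only the first Friday of a month,
    # which is the Friday that falls on day 1..7.
    return is_friday and taken.day <= 7


today = date(2008, 6, 7)
for taken in (date(2008, 6, 3), date(2008, 5, 23), date(2008, 5, 22),
              date(2008, 2, 1), date(2008, 2, 8)):
    print(taken, keep_increment(taken, today))
```

The incrementals for which this returns False are the ones whose diffs would have to be rolled up into a neighbouring incremental.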
---
This doesn't actually provide much benefit because the rolled up diffs would be the same size as the sum total of the originals.
---
The ability to delete intermediate incrementals is filed separately as DeleteIntermediate. If "rolling up" simply involved concatenating the diffs, it's correct that no space would be saved. However, the idea is that some of the diffs are overlapping, and we can save space when merging incrementals by deleting the information that was changed in one incremental but overwritten by a subsequent incremental. I won't address whether this fits the requirements of the original poster. David@sickmiller.com 15:51, 7 June 2008 (EST)
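The space argument above can be illustrated with a toy model in Python. It assumes that an incremental simply records, for each path that changed before the next backup, the content the path had at that point in time. That is a deliberate simplification: real rdiff-backup increments are reverse binary diffs, and merging those would mean composing deltas. The name `merge_increments` is hypothetical.

```python
def merge_increments(older, newer):
    """Roll the incremental being deleted (`newer`) into the previous one.

    The result restores the state of `older` directly, without needing
    `newer` as an intermediate step.
    """
    merged = dict(newer)   # paths touched only after `newer` still need undoing
    merged.update(older)   # where both touched a path, the older state wins
    return merged


# Toy data: path -> content at the time of that incremental.
monday = {"a.txt": "a-v1", "b.txt": "b-v1"}
tuesday = {"a.txt": "a-v2", "c.txt": "c-v1"}

merged = merge_increments(monday, tuesday)
print(merged)
print(len(monday) + len(tuesday), "entries before,", len(merged), "after")
```

Here `a.txt` changed in both incrementals, so its Tuesday version is dropped when Tuesday is rolled into Monday. The merged incremental holds three entries instead of four, which is the overlap saving described above.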
