File Backups – Thoughts

I have used rsnapshot a lot in the past to provide snapshots in time for my backup purposes, and I loved how I was able to to pull data from remote machines to backup locally. However, doing the reverse seemed to be a lot trickier. I wanted to push backups to a server instead of have the server pull data. Because when the server started its backup routine, some machines may be switched off. So rather than the server be responsible for backing up the clients, I wanted the clients to be responsible for backing themselves up onto the server itself.

rsnapshot didn’t seem to lend itself to easily to do this, but then I thought “why not use sshfs?”. With sshfs, I can mount a directory from a remote SSH server on the local filesystem as a directory, use that as a the snapshot root and it should work. The only downside is that since rsnapshot must run as root to do a full system file backup, it also means the sshfs must be mounted as root too, and therefore it tries to connect as root to the remote server. This might be fine if you wanted to do a full remote system backup, but enabling root SSH access is a potential security hole. Possible workaround this by making a standard user who is a member of the root or admin group (haven’t checked whether this would work yet.)

Then I found out about rdiff-backup.

rdiff, for those unfamiliar with the term, is rsync diff (or reverse diff, depending on your school of thought.) It uses the rsync algorithm (the same used by rsnapshot) to create a diff file (or delta) which, when applied to a file, can produce another file – a bit like patching using a patch file. Since we only save the differences between a file and its other version, the actual storage space is low.

rsnapshot utilises hard links to store same versions of files across multiple backups, but has a full copy of each new version. rdiff-backup stores the latest version, and stores diffs that enable you to go back to a previous point in time.

In this sense, rsnapshot works like a full backup style, storing each new version in its entirety, whereas rdiff-backup works as full-plus-(reverse) differential. Restoring from the latest backup on either tool takes the same time, but restoring from an older backup would take longer for rdiff-backup, because it would have to assemble that particular version of the files via the diff files, whereas with rsnapshot, the full version is stored (although at the cost of more space). rdiff-backup works great for files which change often, but change only slightly. Rsnapshot works great for files which change rarely, but change entirely. Using a combination of both might be a good idea (say, use Rsnapshot for /home and rdiff-backup for /var and /etc)