[Nix-dev] bup - test suite

Mathijs Kwik mathijs at bluescreen303.nl
Sun Sep 1 15:42:23 CEST 2013


On Sun, Sep 1, 2013 at 3:00 PM, Marc Weber <marco-oweber at gmx.de> wrote:
> Excerpts from Mathijs Kwik's message of Sun Sep 01 14:51:09 +0200 2013:
>> I would like to warn you about bup though.
>> I've used it for daily backups for at least half a year.
> Well - I guess you should have started a new repo each month ..

That's what I did. For at least 6 months :)
But after the third month, I thought it would be a good test to see
if restoring worked as advertised.
So before throwing the previous month away, I used it for a restore
and found out it had nothing :(

After noticing this 2 months in a row, I started checking "bup ls"
daily, but didn't hit the issue.
After starting over again (the next month), the daily "bup ls" check
did catch my repo getting killed after 18 days or so.
I could not find anything special in the logs or in the repo, but of
course I do not have a backup of my repo to check for differences.
Also, issues like this are fairly hard to reproduce (to file bug
reports), and as backups contain quite a bit of private data,
providing snapshots or machine access to devs is not an option.


>
>> I've started over a few times, thinking it was something I messed up
>> or because of version upgrades, but it happened more than once.
>> Currently, I moved to using "btrfs send", which is awesome, but
>> somewhat experimental too :)
> Didn't know about it.
>
>> Anyway, at least when it fails, it reports about it :)
>> supports deleting old revisions
> I've added this as warning. If you think you should have a new full
> backup each month anyway, then it does not matter that much.

It does matter: for me, it happened in 3 out of 4 months.
But of course this might depend on my data.
All I'm saying is: automate regular consistency checks of your repo.
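
A minimal sketch of such an automated check, meant to be run daily from
cron. It assumes bup is on the PATH and uses its default repository (or
the one BUP_DIR points to). The branch name passed in and the idea of
treating a missing branch as a failure are assumptions for illustration,
not something bup prescribes.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], "subprocess.CompletedProcess[str]"]


def run(cmd: Sequence[str]) -> "subprocess.CompletedProcess[str]":
    """Run a command and capture its output as text."""
    return subprocess.run(list(cmd), capture_output=True, text=True)


def check_repo(expected_branch: str, runner: Runner = run) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []

    # 1. Let bup verify its own pack files.
    fsck = runner(["bup", "fsck"])
    if fsck.returncode != 0:
        problems.append("bup fsck failed: " + fsck.stderr.strip())

    # 2. Make sure the branch we back up to is still listed by "bup ls".
    #    (expected_branch is a hypothetical name chosen by the caller.)
    ls = runner(["bup", "ls"])
    branches = [name.rstrip("/") for name in ls.stdout.split()]
    if ls.returncode != 0:
        problems.append("bup ls failed: " + ls.stderr.strip())
    elif expected_branch not in branches:
        problems.append("branch %r missing from bup ls output" % expected_branch)

    return problems
```

A cron wrapper could call check_repo("laptop") and send mail whenever
the returned list is not empty.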

>
> So what do you do now?
> btrfs send on the machine, and btrfs receive on the backup machine?

Well of course it all starts with btrfs snapshots. I do them every 6
hours, as they are so cheap. They are taken in an instant and take up
very little space if you don't change a lot on disk, far less than
block-based snapshotting (lvm) needs. Also, you do not have to
reserve space for them upfront (like lvm).

I sync these snapshots with my backup machine daily, which itself syncs
my most important stuff to an off-site cloud vps overnight. The cool
part is: this all happens with the same tools and with 1 on-disk
format (normal, accessible volumes).
"btrfs send" itself works incrementally (and even has options to
point out which other, older snapshots the receiving end already has,
so it can cheat if you put back data that you had some days ago but
that is no longer in the last snapshot).
Not only are the transfers relatively small, but because btrfs is
really just 1 big journal, it also does not need to scan everything to
find out what has changed. Every other backup solution needs to do
this, which will take a long time (and slow your system) if you have a
few gigs of data, scattered over many thousands of files. Of course
they can cheat by looking at ctimes/mtimes or some index they keep,
but it still takes a lot of time and those cheats can lead to data
loss if you happen to use software that changes file contents, while
keeping ctime/mtime/atime and size the same.
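
The incremental transfer described above can be sketched roughly as
follows. This is only an illustration: it assumes the snapshots are
read-only (which "btrfs send" requires) and that the backup machine is
reachable over ssh. The host name, paths and function names are made up.
"-p" names the parent snapshot to diff against, "-c" names further
snapshots the receiver already has.

```python
import shlex
import subprocess
from typing import List, Optional, Sequence


def send_argv(snapshot: str, parent: Optional[str] = None,
              clone_sources: Sequence[str] = ()) -> List[str]:
    """Build the 'btrfs send' command line for one snapshot."""
    argv = ["btrfs", "send"]
    if parent is not None:
        argv += ["-p", parent]   # send only the difference to this snapshot
    for src in clone_sources:
        argv += ["-c", src]      # older snapshots the receiver also has
    argv.append(snapshot)
    return argv


def receive_argv(host: str, target_dir: str) -> List[str]:
    """Build the ssh command that runs 'btrfs receive' on the backup host."""
    return ["ssh", host, "btrfs receive " + shlex.quote(target_dir)]


def transfer(snapshot: str, host: str, target_dir: str,
             parent: Optional[str] = None,
             clone_sources: Sequence[str] = ()) -> None:
    """Pipe 'btrfs send' into 'btrfs receive' and fail loudly on errors."""
    sender = subprocess.Popen(send_argv(snapshot, parent, clone_sources),
                              stdout=subprocess.PIPE)
    receiver = subprocess.Popen(receive_argv(host, target_dir),
                                stdin=sender.stdout)
    sender.stdout.close()        # let the sender notice if the receiver dies
    receiver_rc = receiver.wait()
    sender_rc = sender.wait()
    if sender_rc != 0 or receiver_rc != 0:
        raise RuntimeError("send exited %d, receive exited %d"
                           % (sender_rc, receiver_rc))
```

For example, transfer("/snap/2013-09-01", "backuphost", "/backup/laptop",
parent="/snap/2013-08-31") would ship only what changed since the
previous day's snapshot.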

Lastly, I employ a simple cleaner cronjob on all machines.
I only keep the 6-hour snapshots on my laptop for 3 days. My backup
system thins them out: it keeps 1 per day for up to a week and 1 per
week for up to 2 months. The off-site VPS only keeps the most recent
one (as cloud disk space is relatively pricey).
So you see these locations have different data-safety purposes. The
local ones are for when I make a mistake and find out quickly. The
ones on the backup machine are there for longer-term backup, and
mainly provide RAID, which my laptop doesn't have. The off-site
backup is only there in case my house explodes or something, so it
doesn't need history. The nice thing is that all these purposes are
served by a single tool.
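
The three retention rules above could be expressed along these lines.
This is a sketch, not the actual cleaner script: the function names are
invented, "2 months" is approximated as 60 days, weeks are ISO calendar
weeks, and keeping the newest snapshot in each day or week is an
assumption.

```python
from datetime import datetime, timedelta
from typing import Iterable, Set


def keep_laptop(snapshots: Iterable[datetime], now: datetime) -> Set[datetime]:
    """Laptop: keep every snapshot from the last 3 days."""
    return {ts for ts in snapshots if now - ts <= timedelta(days=3)}


def keep_backup_machine(snapshots: Iterable[datetime],
                        now: datetime) -> Set[datetime]:
    """Backup machine: 1 per day for a week, then 1 per week for ~2 months."""
    keep = {}
    for ts in sorted(snapshots):
        age = now - ts
        if age <= timedelta(days=7):
            bucket = ("day", ts.date())
        elif age <= timedelta(days=60):      # assumption: 2 months = 60 days
            iso = ts.isocalendar()
            bucket = ("week", iso[0], iso[1])
        else:
            continue                          # too old, drop it
        keep[bucket] = ts                     # sorted, so the newest wins
    return set(keep.values())


def keep_offsite(snapshots: Iterable[datetime]) -> Set[datetime]:
    """Off-site VPS: keep only the most recent snapshot."""
    snaps = list(snapshots)
    return {max(snaps)} if snaps else set()
```

The cleaner job on each machine would then delete every snapshot that
is not in the returned set.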

>
>> and does not have a separate on-disk format like bup (which
>> effectively doubles space requirements if you want your bup history
>> available on the same machine as your live data).
> I compared a use case (500 MB with many git repos) and it was not that
> bad - eg compared tar j or storebackup - but fastest (initial and
> incremental backup). The initial backup was 4 times faster.

I'm not talking about the stuff bup keeps locally. I'm mainly talking
about the real repo size.
I know it is able to compress normal files quite well. But try making
a bup repo of your 100g+ music collection.
It will take up another 95g at least, which is wasteful if you're on a
laptop with a 160g ssd or something.

Of course you can decide not to use bup for your music (use rsync for
that), or to only make backups to a remote bup repo and use some
snapshotting solution locally (for protecting against mistakes), but
then you end up using 3 tools.

>
> It could be a little risk trusting btrfs as main fs and as backup fs :)
> If you get a kernel which is bad it could destroy both ..

I fully agree. That's why I don't upgrade my backup machine very often
(or automatically), and when I do, it's to a version I've been running
for at least a week on my main laptop. And the VPS machine waits
another week after that.

The same is true for upgrading bup though. In theory it can destroy a
repo without reporting errors. And you don't usually back up bup repos
themselves :) Your test suite is not gonna find out as it tests _new_
repos.

>
>> https://github.com/bluescreen303/bluenix/blob/master/pkgs/kernel/btrfs-updates-against-3.10.7.patch
>> , which is just a plain "git diff" patch comparing mainline with
>> linux-btrfs.
>
> If nobody is heavily depending on it - then the build failure is not
> that bad. So I'll just wait. bup is closer to a release anyway.
>
> I was a little bit surprised that it says "You can fuse the backup" -
> but only got 000000/0ef234e like files ..
>
> Marc Weber