Backups

Holding on to my data

It was April 26th, 1999. I had just got back from school and, as always, the first thing I did was turn on my computer. Everything seemed normal at first, but I soon noticed that something was wrong. Instead of the usual sound of the machine booting up, there was silence.

I tried pushing the power button again. And again. And again.

Nothing.

I checked inside the computer for signs of obvious problems, but everything seemed fine. Nothing was burned and all the cables were in the right places.

With my parents’ help, I took it to the local repair shop. There we got a quick diagnosis: the motherboard was bricked and all the data on the disk was lost.

Several years later I decided to upgrade the hard disk in my laptop to an SSD. SSDs were still a novelty and there were a lot of opinions about their reliability, but one thing was common to all of them: if something goes wrong with the disk, you lose access to all the data.

Because I was determined to keep all my data safe, I decided to make regular backups. I had my old laptop disk lying around, so I used it as a hot spare. Once a week I would clone my SSD to it. That way, if anything happened, I could just swap the disks and continue using my machine.

Once I had regular backups of my laptop, a computer which was crucial to my job, I started thinking about the rest of my data and what would be the best way to protect it.

I first reached for JungleDisk. Back then it was a new solution that offered to store data in Amazon S3. All critical (but not too big) files went into S3.

Additionally, I got myself another external disk for incremental backups. For that, Time Machine works just fine. I came across small issues when I ran out of space on the disk, but recently it has been running without any problems.

Recently I switched from JungleDisk to tarsnap. The main reason is that JungleDisk tends to be CPU-heavy while doing backups, and I just got tired of my computer being unusable. With tarsnap, it’s not a problem. Plus it has other advantages.

To be exact, I use tarsnap with acts, which implements a grandfather-father-son backup rotation scheme. To make it work on OS X I had to modify acts a bit, but because it’s a bash script, it wasn’t that complicated (I should contribute my changes back to the acts project).
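For readers who haven’t met grandfather-father-son rotation before, here is a minimal Python sketch of the idea. It is not how acts works internally (acts is a bash script and I haven’t reproduced its logic here). The function names, the retention counts (7 daily, 4 weekly, 12 monthly) and the example dates are all assumptions made up for illustration. The sketch keeps the most recent daily backups, plus the newest backup from each of the last few weeks and months, and everything else becomes a candidate for deletion.

```python
from datetime import date, timedelta


def newest_per_period(days, key, limit):
    """Pick the newest backup in each of the `limit` most recent periods.

    `days` must be sorted newest first; `key` maps a date to its period.
    """
    chosen = {}
    for d in days:
        period = key(d)
        if period not in chosen:
            if len(chosen) == limit:
                break  # already have enough periods; the rest are older
            chosen[period] = d
    return set(chosen.values())


def gfs_keep(dates, daily=7, weekly=4, monthly=12):
    """Return the backup dates to retain (hypothetical retention counts)."""
    days = sorted(set(dates), reverse=True)
    keep = set(days[:daily])  # "sons": the most recent daily backups
    # "fathers": newest backup in each recent ISO week
    keep |= newest_per_period(days, lambda d: d.isocalendar()[:2], weekly)
    # "grandfathers": newest backup in each recent calendar month
    keep |= newest_per_period(days, lambda d: (d.year, d.month), monthly)
    return sorted(keep)


# Example: 90 consecutive daily backups ending on an arbitrary date.
today = date(2015, 3, 31)
backups = [today - timedelta(days=i) for i in range(90)]
kept = gfs_keep(backups)
print(f"keeping {len(kept)} of {len(backups)} backups")
```

The point of the scheme is that recent history stays fine-grained while older history thins out, so the number of stored archives stays small even after months of daily backups.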

The last piece of software I use for backups is git-annex. It’s a versatile piece of software. My main use case is ensuring that my collection of photos is safe. I keep one copy in S3 and another on an external drive. git-annex takes care of ensuring that every file is stored in at least two places.

So far I haven’t had to test a full recovery from my online backups. I hope I will never have to. I have been able to get back a lost or corrupted file, though. Regardless of that, I feel safer about my data.

By the way, the reason my computer lost all its data back in 1999 was the Chernobyl virus. A lot of people were affected and it was quite big news. (Knowing a great deal more about computers now, and having read the Wikipedia article, I know that I could have recovered my data. If only I had known. But then I would have missed such a valuable lesson.)