The revenge of rsync
This is a repost of a text first published in June 2013 on my previous blog, erroneousthoughts.org (now decommissioned). It illustrates how just one tiny character can make a world of difference…
Just the day after I wrote about rsync, I made one of my biggest programming blunders, and precisely while using rsync.
I use a simple Python script to automate my backups.1 One of the things it does is synchronise the contents of several large folders with their counterparts on an external hard drive. For instance, say you have five big folders, a, b, c, d and e, on your local machine, that you need to keep synchronised copies of. The way I use the script to solve this problem is to run a command of the form
$ rsync <options> <local_folder> <external_hdd>/backups
for each local folder to be backed up. The way this is supposed to work is that rsync will create folders a, b, etc. inside the backups folder on the external hard drive. In addition, one of the options to rsync I was using was --delete-during, which deletes from the destination folder any files that are not found in the source folder. This is to avoid keeping a backup of files I had genuinely deleted.
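As a rough illustration of the loop described above, here is a minimal Python sketch that builds one rsync command per folder. The helper name build_rsync_command, the folder list, the mount point /mnt/external_hdd and the -a flag are assumptions made for the example; the original script and its exact options are not shown in this post. Only --delete-during comes from the text.

```python
def build_rsync_command(source, backups_dir, options):
    """Return the argument list for one rsync invocation (hypothetical helper)."""
    return ["rsync", *options, source, backups_dir]


# Assumed values, for illustration only.
FOLDERS = ["a", "b", "c", "d", "e"]
BACKUPS_DIR = "/mnt/external_hdd/backups"
OPTIONS = ["-a", "--delete-during"]

commands = [build_rsync_command(folder, BACKUPS_DIR, OPTIONS) for folder in FOLDERS]

for command in commands:
    # Print rather than execute; pass `command` to subprocess.run to actually run it.
    print(" ".join(command))
```

Building the command as a list, rather than as one string, means each folder name is passed to rsync exactly as written, which is precisely why a stray character in the list matters.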
So far so good: everything works as expected. But now suppose that you make a small mistake and, in the list2 of folders to be backed up, write d/ instead of d. Seems fairly innocuous, right? Wrong! If you go back to the fine manual, you’ll see that
$ rsync <options> d <external_hdd>/backups
works as you’d expect: it creates a folder backups/d on the external hard drive if one does not already exist, and keeps it synchronised with the local one. But
$ rsync <options> d/ <external_hdd>/backups
copies the contents of the d folder directly into the backups folder, regardless of whether the latter contains a subfolder named d.
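The difference between the two forms can be modelled in a few lines of Python. This is only a simplified sketch of the trailing-slash rule, not rsync's actual implementation; the function name effective_target and the mount point are made up for the example.

```python
import posixpath


def effective_target(source, destination):
    """Model which directory rsync keeps in sync with `source`.

    Simplified rule: without a trailing slash, the source folder itself is
    copied into `destination`; with one, only its contents are.
    """
    if source.endswith("/"):
        return destination
    return posixpath.join(destination, posixpath.basename(source))


print(effective_target("d", "/mnt/external_hdd/backups"))   # /mnt/external_hdd/backups/d
print(effective_target("d/", "/mnt/external_hdd/backups"))  # /mnt/external_hdd/backups
```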
But it gets worse. How? Remember that one of the options to rsync is --delete-during, which deletes from the destination folder anything that is not present in the source folder. And using d/ as the source in the rsync command means that the folder being synchronised is no longer backups/d, but rather backups itself, which means anything in there that is not contained inside the source folder d will be deleted. Like, for instance, all the folders a to c backed up previously. Or anything else in the backups folder that wasn’t also in the d folder. And thus, I lost backups dating back almost a year. And the only reason this was not an irreparable mistake is that I, being the paranoid nut that I am, had the really important stuff (think password files and encryption keys) backed up in another location.
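To make the damage concrete, here is a toy model of the deletion step: whatever is in the synchronised destination folder but not in the source gets removed. The file names and folder contents are invented for the example, and the set difference is a simplification of what rsync really does.

```python
def entries_deleted(destination_entries, source_entries):
    """Simplified model of --delete-during: names present in the synchronised
    destination folder but absent from the source are removed."""
    return sorted(set(destination_entries) - set(source_entries))


# Hypothetical contents, for illustration only.
backups = ["a", "b", "c", "d"]         # top level of the backups folder
backups_d = ["notes.txt", "old.txt"]   # contents of backups/d
local_d = ["notes.txt", "photo.jpg"]   # contents of the local folder d

# Source "d": rsync synchronises backups/d with d.
print(entries_deleted(backups_d, local_d))  # ['old.txt']
# Source "d/": rsync synchronises backups itself with the contents of d.
print(entries_deleted(backups, local_d))    # ['a', 'b', 'c', 'd']
```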
Even so, for someone who’s been doing this for years, to make such a blunder… it is, to put it gently, a humbling experience. The solution, of course, is to make sure, before running the rsync command, that the source folder does not end with a slash. As with so much else in programming, it’s obvious with hindsight…
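One way to enforce that check in a Python script is to normalise every source path before it reaches rsync. This is a minimal sketch; the function name strip_trailing_slash is hypothetical, and the script may well guard against the problem differently.

```python
def strip_trailing_slash(source):
    """Remove trailing slashes so rsync copies the folder itself, not its contents."""
    if not source:
        raise ValueError("empty source path")
    stripped = source.rstrip("/")
    # A path made only of slashes is the filesystem root; keep it as "/".
    return stripped or "/"


for folder in ["d", "d/", "photos/2013//"]:
    print(strip_trailing_slash(folder))
```

Applied to each entry in the list of folders, this makes d and d/ behave identically, so a stray slash can no longer redirect the deletion to the whole backups folder.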
March 19, 2024.