So it happened that the disk hosting our code repository developed some bad blocks. Must be contagious or something; this is already the third system within a month…

I noticed this when I received some strange messages from cron: my backup script had started failing because svnadmin dump could not read a file. On closer examination, one of the diff files in the SVN repository was damaged. This was bad: the content is stored as a series of diffs, so a particular branch of the repository became unavailable.

The Subversion (SVN) repository is located on /var, which is a separate partition. Since /var receives a lot of writes, I decided to migrate the whole of /var to a different partition to avoid further bad blocks.

Armed with rsync and ddrescue, I started the recovery process. You know the drill: format the new partition, rsync the content and ddrescue the damaged file. But this time ddrescue let me down, so it was time for the backups to step in.

Satisfied with a quick solution (restore from backup), I stopped all services that were using /var, unmounted it, mounted the new partition as /var and restarted the services I had stopped.

To my surprise, SVN still did not want to cooperate. I was getting a bad feeling about this, which was confirmed after examining all eleven weekly backups: in every one of them, the file was corrupted.

How did I not notice this earlier? Well, it was time to think about a new strategy.
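One way to catch this kind of silent corruption earlier would be to run svnadmin verify before each backup and stop rotating old backups as soon as it fails. Below is a minimal sketch of that idea, not the backup script from this story. The function names and the repository path in the usage note are hypothetical, and the `runner` parameter is there only so the check can be exercised without a real repository.

```python
import subprocess


def verify_command(repo_path):
    """Build the `svnadmin verify` command line for one repository."""
    return ["svnadmin", "verify", "--quiet", repo_path]


def repo_is_healthy(repo_path, runner=subprocess.run):
    """Return True when `svnadmin verify` exits with status 0.

    `runner` defaults to subprocess.run; it is injectable so the logic
    can be tested on a machine without svnadmin installed.
    """
    result = runner(verify_command(repo_path), capture_output=True, text=True)
    return result.returncode == 0
```

A backup script could call something like `repo_is_healthy("/var/svn/repo")` (a made-up path) first, and send a loud e-mail instead of overwriting the oldest good backup when it returns False.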

I tried running svnadmin recover, but that didn’t yield anything.

I don’t know the internal workings of SVN, but what I figured out is:

  • content is stored as diffs
  • each commit is one big file containing all changes
  • there is a file containing the number of the current revision
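These observations match the FSFS backend that Subversion uses by default, where the per-revision files live under db/revs and the counter is the file db/current, whose first field is the youngest revision number. A minimal sketch for reading it, assuming that layout; the function name is made up for illustration:

```python
from pathlib import Path


def read_youngest_revision(repo_path):
    """Return the revision number stored in <repo>/db/current.

    Assumes an FSFS repository. Older repository formats put extra
    fields after the number; only the first field is used here.
    """
    text = (Path(repo_path) / "db" / "current").read_text()
    return int(text.split()[0])
```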

Luckily, I knew exactly which files had been committed in that particular diff. Even better for me, it was a commit of only a few new big files! My best chance would be to recreate the corrupted commit.

I found that by editing the revision counter, I could trick the SVN server into thinking that the ‘current’ revision was whatever I set it to be.

So I rolled back the revision counter to the revision before the corruption, checked out the branch with TortoiseSVN into a new location, added the ‘new’ files and committed them to the repository. Then I reset the revision counter to its original value and voilà, everything started working!
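For the curious, the counter trick can be sketched in a few lines. This is an illustration only, assuming an FSFS repository whose counter is the single-number file db/current; it is not a supported Subversion procedure, and the helper name and the backup file name are hypothetical. It keeps a copy of the original counter, writes the lower revision number, and puts the original back when the block ends.

```python
from contextlib import contextmanager
from pathlib import Path
import shutil


@contextmanager
def temporary_revision(repo_path, revision):
    """Pretend `revision` is the youngest one, then restore the counter."""
    current = Path(repo_path) / "db" / "current"
    backup = current.with_name("current.orig")  # hypothetical backup name
    shutil.copy2(current, backup)               # keep the original counter
    try:
        # Assumes the counter file holds just the number and a newline.
        current.write_text(f"{revision}\n")
        yield current
    finally:
        backup.replace(current)                 # put the original back
```

The re-commit of the lost files would happen inside the `with temporary_revision(...)` block. Anything like this is best tried on a copy of the repository, with the SVN server stopped.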

I guess this time I had more luck than brains. What would you do?

