So it happened that the disk hosting our code repository developed some bad blocks. Must be contagious or something, this is already the third system within a month…
I noticed this when I received some strange messages from cron: my backup script had started failing because svnadmin dump could not read a file. On closer examination, one of the diff files in the svn repository was damaged. This was bad: since the content is stored as a series of diffs, a particular branch of the repository became unavailable.
The Subversion (aka SVN) repository is located on /var, a separate partition. Since /var receives a lot of writes, I decided to migrate the whole of /var to a different partition to avoid running into further bad blocks.
Armed with rsync and ddrescue I started the recovery process. You know the drill: format the new partition, rsync the content and ddrescue the damaged file. But this time ddrescue let me down. Now it was time for backups to step in.
Satisfied with a quick solution (restore from backup), I stopped all services which were using /var, unmounted it, mounted the new partition as /var and restarted all previously stopped services.
To my surprise, SVN still did not want to cooperate. I was getting a bad feeling about this, which was confirmed after examining all eleven weekly backups: the file was corrupted in every one of them.
How did I not notice this earlier? Well, it was time to think about a new strategy.
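A check along these lines might catch this kind of silent corruption before it spreads through a whole set of backups. It is only a sketch: svnadmin verify and svnadmin dump are real subcommands, but the function name, the paths and the injectable run parameter are made up for illustration.

```python
import subprocess


def verified_dump(repo_path, dump_path, run=subprocess.run):
    """Dump the repository only if `svnadmin verify` succeeds.

    `run` defaults to subprocess.run; it is a parameter (an assumption of
    this sketch) so the function can be tried without a real repository.
    """
    # Verify first: a non-zero exit code means some revision is unreadable.
    check = run(["svnadmin", "verify", "--quiet", repo_path])
    if check.returncode != 0:
        return False

    # Only now write the dump, streaming svnadmin's stdout into the file.
    with open(dump_path, "wb") as out:
        dump = run(["svnadmin", "dump", "--quiet", repo_path], stdout=out)
    return dump.returncode == 0
```

A backup script could call verified_dump() and refuse to rotate out older dumps whenever it returns False, so that at least one good copy survives.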
I tried running svnadmin recover, but that didn’t yield anything.
I don’t know the internal workings of SVN, but what I figured out is:
- content is stored as diffs
- each commit is one big file containing all changes
- there is a file containing the number of the current revision
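For illustration, here is how that revision counter could be read. This assumes the repository uses the FSFS backend, where the counter lives in db/current; the function name is hypothetical, and a different backend or layout would need a different approach.

```python
from pathlib import Path


def youngest_revision(repo_path):
    """Return the revision number recorded in <repo>/db/current.

    Assumes an FSFS repository; the path layout is an assumption of
    this sketch, not something checked at runtime.
    """
    text = (Path(repo_path) / "db" / "current").read_text()
    # Older FSFS formats store extra fields after the number,
    # so only the first whitespace-separated token is used.
    return int(text.split()[0])
```

On a healthy repository, svnlook youngest should report the same number without touching the files directly.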
Luckily I knew exactly which files were committed in that particular revision. Even better, it was a commit of only a few new big files! My best chance would be to recreate the corrupted commit.
When I manipulated the revision counter, the SVN server was tricked into thinking that the ‘current’ version was whatever I set it to be.
So I rolled back the revision counter to the revision before the corruption, checked out the branch with TortoiseSVN into a new location, included the ‘new’ files and committed them to the repository. Then I reset the revision counter to its original value and voilà, everything started working!
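The counter juggling could be sketched roughly like this. It is a hack, not a supported procedure: it assumes an FSFS repository with the counter in db/current, stopped services and a full copy of the repository made beforehand. The name pretend_youngest is invented for this example.

```python
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def pretend_youngest(repo_path, revision):
    """Temporarily make db/current claim `revision`, then restore it."""
    current = Path(repo_path) / "db" / "current"
    original = current.read_text()  # remembered so it can be put back

    # Replace only the first field (the revision number) and keep any
    # extra fields that older FSFS formats store on the same line.
    fields = original.split()
    fields[0] = str(revision)
    current.write_text(" ".join(fields) + "\n")
    try:
        yield
    finally:
        # Put the original counter back, whatever happened in between.
        current.write_text(original)
```

The checkout and the re-commit would happen inside the with block, for example `with pretend_youngest(repo, bad_rev - 1): ...`, where repo and bad_rev stand in for the real path and the corrupted revision number.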
I guess this time I had more luck than brains. What would you do?