subversion-users mailing list archives

From "Bert Huijben" <b...@qqmail.nl>
Subject RE: Almost repetitive repository corruption
Date Sun, 29 Dec 2013 22:58:01 GMT

> -----Original Message-----
> From: Igor Varfolomeev [mailto:i3v@mail.ru]
> Sent: zondag 29 december 2013 23:00
> To: users@subversion.apache.org
> Subject: Almost repetitive repository corruption
> 
> Hi all,
> 
> I’ve just run into a weird bug which damaged my svn repository. I still
> don’t understand what exactly went wrong, so I don’t know how to
> describe it in a clear and simple manner, sorry… I’ll just try to describe
> all the symptoms I’ve experienced. I’ll use real file names, since I
> wasn’t able to reproduce this bug on a synthetic test repository.
> 
> *SETUP*
> Most simple single-user, single-PC setup. Local repository.
> First svn version: “Subversion command-line client, version 1.8.5.”.
> Windows 7 x64
> Antivirus: Kaspersky Endpoint Security 10
> 
> *THE STORY*
> The story began when I ran into some sort of error message while
> trying to commit r3349.
> After a bit of struggling, I realized that my repository had been broken
> by the previous commit (r3348). The nasty thing is that the previous commit
> finished without any error message.
> 
> *SYMPTOMS*
> **svnadmin verify**
> Output ends like this:
> <….>
> * Verified revision 3346.
> * Verified revision 3347.
> svnadmin: E160004: Corrupt node-revision '4d-610.2-2392.r3348/35659066'
> svnadmin: E160004: Found malformed header '' in revision file
> 
> **svn checkout**
> When I try to check out a new working copy, I receive a similar
> message:
> <…>
> W:\testCO\Binar\Matlab\deploy
> W:\testCO\Binar\Matlab\deploy\x64
> W:\testCO\Binar\Matlab\deploy\x64\Binar_x64.prj
> W:\testCO\Binar\Matlab\deploy\x64\Binar_x64
> W:\testCO\Binar\Matlab\deploy\x64\Binar_x64\distrib
> Corrupt node-revision '4d-610.2-2392.r3348/35659066'
> Found malformed header '' in revision file
> 
> **svn Repository Browser**
> When I navigate to
> file:///V:/R_Matlab/Binar/trunk/Binar/Matlab/deploy/x64/Binar_x64
> in the TortoiseSVN repository browser, I see the same error message:
> 
> Corrupt node-revision '4d-610.2-2392.r3348/35659066'
> Found malformed header '' in revision file
> 
> Here’s a screenshot: http://sdrv.ms/1fJVuwa
> 
> *ZEROS IN DATA FILE*
> Luckily, I have a full backup (r3337). I’ve manually repeated all my
> commits up to r3347 and verified that the repository is OK in this state.
> 
> Next, I’ve tried to reproduce the bug:
> 
> 1.	First (“try1”), I repeated the same Matlab commit script
>         (Matlab simply calls svn, just like from cmd). And… «success»:
>         the same bug again!
> 
> 2.	Second (“try3”), I managed to reproduce the bug using
>         only Windows cmd commands.
> 
> 3.	Third (“try4” and “try5(0)”), I wrote a bat script to
>         reproduce the same actions.
> 
> I’ve compared the
> R_Matlab\db\revs\3\3348
> file across the different “tries” (the initial bug is designated “try0”) and
> discovered a single interesting thing:
> each “3348” file has a long sequence of zero bytes:
> 
> •	try0: 0x2201B0A to 0x2201FFF
> 
> •	try1: 0x2201000 to 0x2201FFF
>        o	try0_vs_try1_p1: http://sdrv.ms/Ju7nev
>        o	try0_vs_try1_p2: http://sdrv.ms/Ju7tmu
>        o	try0_vs_try1_p3: http://sdrv.ms/Ju7AOI
> 
> •	try3: 0x2201B11 to 0x2201FFF
>        o	try0_vs_try3_p1: http://sdrv.ms/Ju7G9g
>        o	try0_vs_try3_p2: http://sdrv.ms/Ju7HKd
> 
> •	try4: 0x2201000 to 0x2201FFF
>        o	try0_vs_try4_p1: http://sdrv.ms/Ju7OFE
>        o	try0_vs_try4_p2: http://sdrv.ms/Ju86MJ
>        o	try0_vs_try4_p3: http://sdrv.ms/Ju89ID
> 
> •	try5(0): 0x2201000 to 0x2201FFF (just like try4).
>        o	try0_vs_try5(0)_p1: http://sdrv.ms/1daKwjG
>        o	try0_vs_try5(0)_p2: http://sdrv.ms/1daKxUx
>        o	try0_vs_try5(0)_p3: http://sdrv.ms/Ju8iM5
> 
> 
> Moreover, try4 and try5(0) differ in only one place: two zero bytes
> starting at 0x21F9FFE (in the case of “try5(0)”):
> http://sdrv.ms/19jmBdm
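
The zero-byte ranges reported above can be located without a hex editor. The following is a minimal Python sketch, not part of the original report: the function names are hypothetical, and the 4 KiB page size is an assumption chosen only because every reported run ends at 0x2201FFF, the last byte before offset 0x2202000. The runs in try1, try4 and try5(0) are exactly 0x1000 bytes long, and the two extra zero bytes in try5(0) (0x21F9FFE to 0x21F9FFF) also end just before a multiple of 0x1000. That is arithmetic on the reported offsets, not a diagnosis.

```python
import re
from pathlib import Path

PAGE_SIZE = 0x1000  # 4 KiB; an assumed block size, not stated in the report


def find_zero_runs(path, min_len=2):
    """Return inclusive (first, last) offsets of each run of >= min_len zero bytes."""
    data = Path(path).read_bytes()  # reads the whole file into memory
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    # The quantifier is greedy, so each match covers a maximal run of zeros.
    return [(m.start(), m.end() - 1) for m in pattern.finditer(data)]


def ends_on_page_boundary(last_offset, page_size=PAGE_SIZE):
    """True if the byte following last_offset starts a new page."""
    return (last_offset + 1) % page_size == 0


def report(path, min_len=16):
    """Print each zero run with its length and whether it ends on a page boundary."""
    for first, last in find_zero_runs(path, min_len):
        print(f"0x{first:X} to 0x{last:X}  len={last - first + 1}  "
              f"page-aligned end: {ends_on_page_boundary(last)}")
```

Calling `report()` on a copy of a revision file such as R_Matlab\db\revs\3\3348 lists candidate ranges. Compressed or binary payloads can legitimately contain short zero runs, so a larger `min_len` reduces noise.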
> 
> *BUG DISAPPEARED*
> That’s all I have: 5 broken repositories. After that, the bug DISAPPEARED.
> Just like a UFO :) . I’ve launched the SAME script, with the SAME
> input data, 10 more times (“try5(1)”, “try5(2)”…) – nothing – svn
> correctly commits r3348, and the resulting repository is valid:

	Hi,

Did you make sure you restored db\rep-cache.db in every step? (This may make more of a
difference than you expect.)

The fact that you copy a single file two times in one commit makes me suspect that this is
relevant information.

Are all the drives in your test scenario local hard disks, or are some network drives involved?

	Bert


> 
> •	svn verify is OK
> 
> •	I’m able to see contents of
>         “R_Matlab/Binar/trunk/Binar/Matlab/deploy/x64/Binar_x64”
>         in tortoise svn repository browser
> 
> •	svn checkout is OK.
> 
> When I compare “revs\3348” for “try4” vs “try5(1)”, the ONLY
> difference is the long sequence of zero bytes mentioned before:
> 
> •	try4_vs_try5(1)_p1: http://sdrv.ms/1edmEdV
> 
> •	try4_vs_try5(1)_p2: http://sdrv.ms/Ju8YkC
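
The comparisons quoted above were shared as screenshots. A similar check can be scripted. The sketch below is illustrative only: `diff_ranges` is a hypothetical helper name, and it reports differing byte ranges over the common length of two files, ignoring any trailing size difference.

```python
def diff_ranges(path_a, path_b, chunk_size=1 << 16):
    """Return inclusive (first, last) offset ranges where two files differ.

    Only the common length is compared; a size mismatch is not reported.
    """
    ranges = []
    start = None  # first offset of the differing run currently being tracked
    offset = 0    # absolute offset of the current chunk
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            n = min(len(a), len(b))
            if n == 0:
                break
            if a[:n] == b[:n]:
                # Fast path: identical chunk closes any open run.
                if start is not None:
                    ranges.append((start, offset - 1))
                    start = None
            else:
                for i in range(n):
                    if a[i] != b[i]:
                        if start is None:
                            start = offset + i
                    elif start is not None:
                        ranges.append((start, offset + i - 1))
                        start = None
            offset += n
            if len(a) != len(b):
                break  # one file ended; stop at the common length
    if start is not None:
        ranges.append((start, offset - 1))
    return ranges
```

For two copies of the same revision file taken from different tries (the paths are whatever the copies were saved as), printing each range as `hex(first), hex(last)` gives output directly comparable with the offsets listed earlier in the report.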
> 
> *REPRODUCTION SCRIPT*
> The bat script that resulted in the error is quite straightforward. It simply
> copies several files. It might not be a good idea to copy a modified file
> without committing it first, but it still should not result in an error… The
> bat file (used in try4) is here: http://sdrv.ms/19ld4FN
> Another thing to mention is that the size of the files in the r3348 commit is
> about 250 Mbytes….
> To my shame, my repository is both large (~30GB) and contains
> confidential data, so I’m unable to share it :( .
> 
> All files mentioned above are in this folder: http://sdrv.ms/1jMN250
> 
> *LOOKING FOR SIMILAR CASES*
> Mainly, I’ve just googled “svn: Corrupt node-revision”. It looks like
> this error message is quite common, but no one has tried to understand
> its source. Though, there’s a “what was that?” question
> in [1] (see link below).
> Moreover, it looks like no one has experienced “repetitive” behavior…
> In some cases, the issue was resolved by restoring revision files from
> backup [1], or by using svn dump/load [3,4]. In one report [2],
> julian.foad <at> wandisco.com was using John Szakmeister's
> 'fsfsverify.py' to analyze corruption. Though, it looks like in his case
> the corruption type was quite different. In one post [4], VinnyJames
> said: “we've seen this happen during heavy load”.
> 
> 1. http://www.wandisco.com/svnforum/threads/38519-Commit-errors-Revision-files-corrupted
> 
> 2. http://thread.gmane.org/gmane.comp.version-control.subversion.devel/123110
> 
> 3. http://stackoverflow.com/questions/5543285/how-do-i-fix-a-repository-with-one-broken-revision
> 
> 4. http://dev-notes-to-self.blogspot.com/2009/01/fixing-corrupt-subversion-repository.html?showComment=1280529811361#c6899551059356251422
> 
> 
> *QUESTIONS*
> So….
> 1.	What was that? Any ideas? May it happen again?
> 2.	Any other interesting diagnostic info I can get from these
> repositories?
> 3.	Should I re-post this to subversion mailing list also? Or is it,
>         most probably, dependent on tortoise somehow?
>         Say, due to some caching?
> 
> 
> *PS*
> I’ve already posted the text above on the TortoiseSVN mailing list:
> 
> http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=3070808
> 
> and received a suggestion to re-post it here:
> 
> http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=3070843
> 
> 
> 
> *PPS*
> I’m not subscribed and would appreciate being explicitly Cc:ed in any
> responses.


