subversion-users mailing list archives

From Jon Stafford <>
Subject RE: not storing diffs of binary files
Date Tue, 09 Aug 2011 17:19:11 GMT
Thanks everyone for the responses.  To check my understanding, and to give half a conclusion:

Every revision of a file apart from the very first is stored as a delta against
some previous revision.  Subversion would probably use the least disk space *if*
each revision were stored as a delta against the immediately preceding revision.  But that
would be really slow for reconstructing the 1000th revision.  So instead, each revision is
stored as a delta against the base you get by writing the file's revision number in binary
and flipping its rightmost 1 bit to 0 (the "skip deltas" scheme).

This generally gives a balance between space used up and time to recreate any given revision
of the file.
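
To make the "flip the rightmost 1" rule concrete, here is a minimal sketch in Python.
It is not Subversion's code: the function names are made up for illustration, and it
assumes the file's revisions are simply numbered 0, 1, 2, ... with 0 as the end of
every chain.

```python
def skip_delta_base(n: int) -> int:
    """Return the delta base for file revision n (hypothetical helper).

    Clearing the rightmost 1 bit of n is the same as n & (n - 1).
    """
    return n & (n - 1)


def delta_chain(n: int) -> list[int]:
    """List the revisions that must be read to reconstruct revision n.

    Assumption: revision 0 is the end of every chain.
    """
    chain = [n]
    while n > 0:
        n = skip_delta_base(n)
        chain.append(n)
    return chain


if __name__ == "__main__":
    print(delta_chain(15))    # [15, 14, 12, 8, 0]
    print(delta_chain(1000))  # [1000, 992, 960, 896, 768, 512, 0]
```

Under these assumptions the number of deltas to apply equals the number of 1 bits in
the revision number, so revision 1000 needs 6 deltas rather than 1000.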

OK, how does all that sound so far?

Knowing this I was hoping I'd look again and understand what was going on with my repository
with successive zips of my database data checked in.  Not quite...

I can see that the deltas aren't necessarily against the immediately preceding revision - in
fact with 15 revisions it's reassuring to see them doing exactly as billed in the
skip deltas document.

The bit I still can't reconcile is the difference in the delta size between xdelta standalone
(small) and the delta stored by subversion (large - almost the size of the file itself sometimes).

I've checked in various versions of my database data zipped.  Some with a month of changes
between each revision, some with the most trivial change possible between revisions.

For a trivial change: 
xdelta delta size = 300KB, subversion db\revs file size = 300KB

For a month of database edits:
xdelta delta size = 3 or 4MB, subversion db\revs file size = 50MB

Obviously, for a fair comparison I'm only picking revisions where subversion did delta against
the immediately preceding revision.

So does subversion (version 1.6.11) use an old, not quite so good, xdelta?  Or is it just
that it applies xdelta after it has already done some format manipulation on the file, which
then makes it less delta-able?  Or something else...


-----Original Message-----
From: Andreas Krey [] 
Sent: 08 August 2011 13:44
To: Mark Phippard
Cc: Daniel Shahaf; Jon Stafford;
Subject: Re: not storing diffs of binary files

On Mon, 08 Aug 2011 16:28:42 +0000, Mark Phippard wrote:
> All revisions are "deltified" but some are deltified against an empty
> stream.  I do not know if the diagram is accurate, but revisions 1 and 2 of
> the file are both against the empty stream.

Revision 0 goes as a regular revision in that diagram, according to the text.
This way every rev but the first is deltified; would be stupid otherwise.
(And the first is deltified against the empty stream.)


"Totally trivial. Famous last words."
From: Linus Torvalds <torvalds@*.org>
Date: Fri, 22 Jan 2010 07:29:21 -0800
