subversion-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ivan Zhakov (JIRA)" <>
Subject [jira] [Updated] (SVN-4554) Wrong file length with PLAIN representations in FSFS
Date Mon, 16 May 2016 09:31:12 GMT


Ivan Zhakov updated SVN-4554:
    Fix Version/s: 1.8.16

> Wrong file length with PLAIN representations in FSFS
> ----------------------------------------------------
>                 Key: SVN-4554
>                 URL:
>             Project: Subversion
>          Issue Type: Bug
>          Components: libsvn_fs_fs
>    Affects Versions: 1.8.x
>            Reporter: Stefan Fuhrmann
>            Assignee: Stefan Fuhrmann
>            Priority: Blocker
>             Fix For: 1.8.16, 1.9.0
> FSFS allows PLAIN, i.e. non-deltified non-compressed, representations to store the content
length as 0 as it matches the on-disk size.  Up to and including 1.8.x, there is no restriction
on the representation type for that omission.
> In practice, however, it can be difficult to decide whether a 0 value represents an omission
or an actually empty file: A self-deltified empty file has a length of 0 but an 4 byte on-disk
size.  When representations are read, their header tells us whether it is indeed a PLAIN or
a DELTA representation and that is enough to resolve any ambiguity.
> The problem occurs when we omit the length value for file contents and call {{svn_fs_file_length()}}
on it.  FSFS will report the length as 0 and that causes e.g. 'svnadmin dump' to write broken
dump files where the skipped / empty contents does not match the checksum.
> Up to 1.7.x, we used this omission rule only for hash data, i.e. props and directories,
never for file contents.  Thus, there is no problem with these Subversion releases.  Starting
1.9.0 and up, the "structure" document explicitly restricts the omission to property reps.
 Furthermore, 1.9+ will not omit length values at all.
> 3rd party implementations like SVNkit (?) may have produced instances of omitted length
values for file contents, though.  We need to handle those correctly and extend the API implementation
> Moreover, 1.8.x generalized the rep sharing mechanism.  If a file contents happened to
match a property representation, e.g. "END\n", it would now use the property representation.
 The latter is PLAIN by default and stores a 0 length value in the rep cache.  Hence, the
file contents rep will also report a 0 length.
> Reproduction sketch (requires 1.8.x):
> * Create repo. Keep rep sharing on and prop deltification off.
> * Add empty file, set prop on file and commit.
> * Remove prop on file and commit.
> * Set file contents to "END\n" and commit.
> * run 'svn ls -v' on the parent folder => file length is shown as 0
> The following things need to be fixed:
> * Don't omit the length value - even for properties.
>   That prevents new instances caused by incoming data. (already fixed in 1.9; fix for
1.8 still needed.)
> * Update the "structure" document with info when the omission is safe.
> * Compare the size and length values of the rep returned by the rep cache   with the
data of the new rep.  Only replace new with old if those match. The prevents new instances
caused by the rep cache.
> * Fix {{svn_fs_fs__file_length()}} to return 0 lengths only for files that   are known
empty.  This will be the actual bug fix.

This message was sent by Atlassian JIRA

View raw message