subversion-users mailing list archives

From Mark Phippard <markp...@gmail.com>
Subject Re: Branching slow 1.8.11 https
Date Tue, 31 Mar 2015 12:43:59 GMT

> On Mar 31, 2015, at 8:13 AM, Johan Corveleyn <jcorvel@gmail.com> wrote:
> 
>> On Tue, Mar 31, 2015 at 2:19 AM, Johan Corveleyn <jcorvel@gmail.com> wrote:
>>> On Sun, Mar 29, 2015 at 7:57 PM, Johan Corveleyn <jcorvel@gmail.com> wrote:
>>>> On Sat, Mar 28, 2015 at 5:09 PM, Bert Huijben <bert@qqmail.nl> wrote:
>>>> 
>>>> 
>>>>> -----Original Message-----
>>>>> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
>>>>> Sent: vrijdag 27 maart 2015 22:03
>>>>> To: users@subversion.apache.org
>>>>> Subject: Branching slow 1.8.11 https
>>>>> 
>>>>> Does the following ring a bell for someone?
>>>>> 
>>>>> Recently upgraded our server (on Solaris 10 SPARC) from 1.5.4 to
>>>>> 1.8.11 (CollabNet package). Some time after that, we discovered that
>>>>> branching was very slow. I'm talking about pure server-side branching
>>>>> ('svn copy $URL/trunk $URL/branches/br1'). I'm testing with a 1.8.11
>>>>> client (tried both from same machine as the server, and from another
>>>>> machine on the LAN (100 Mbit)).
>>>>> 
>>>>> - Branching trunk (containing many directories and files): 6-8 minutes
>>>>> - Branching a subfolder of trunk: 20-30 seconds (still very slow)
>>>>> - Branching a single file is fast (< 0.5s or so).
>>>>> 
>>>>> So it seems the performance degrades depending on the depth or size of
>>>>> the tree.
>>>>> 
>>>>> Now, it gets more interesting:
>>>>> - The resulting rev file on the server is always very small (as it
>>>>> should be, it contains only a lightweight 'copy' of the trunk node).
>>>>> - Our repos is currently served via https (Apache 2.2.29).
>>>>> - Branching with file:/// urls is fast (branching trunk takes 0.6s).
>>>>> - When starting an svnserve instance serving the same repository, and
>>>>> branching with svn:// urls, it's fast as well (also 0.6s).
>>>>> - We reproduced it on a copy of the production repo.
>>>>> - Experimenting with the test copy, we found that
>>>>> $repos/dav/activities.d contains ~2000 files. When we clear that
>>>>> directory, the branching times go down by more than half (~2 minutes
>>>>> for trunk, ~10s for subdir of trunk --- i.e. still slow, but it
>>>>> definitely has an impact).
>>>>> - With a 1.7 client connecting with neon, the problem is the same.
>>>>> - During the 'svn copy', an httpd child consumes a lot of cpu (around
>>>>> half a core).
>>>>> - There is no authz configured for this repo (SVNPathAuthz off).
>>>>> - Backend is still in 1.5 format (we have not run svnadmin upgrade
>>>>> yet, a dump+load is planned in a couple of weeks).
>>>>> 
>>>>> So it seems clearly mod_dav_svn related (and not for instance related
>>>>> to the FSFS backend).
>>>>> 
>>>>> I don't think we have anything special in our httpd config:
>>>>> [[[
>>>>>   <Location /test_svn>
>>>>>      SVNInMemoryCacheSize 131072
>>>>>      SVNCacheFullTexts on
>>>>>      SVNCacheTextDeltas on
>>>>>      SSLRequireSSL
>>>>>      AuthName "TEST Subversion Repository"
>>>>>      AuthType Basic
>>>>>      AuthBasicProvider ldap
>>>>>      AuthBasicAuthoritative off
>>>>>      AuthLDAPURL "ldap://redacted:389"
>>>>>      AuthLDAPBindDN "redacted"
>>>>>      AuthLDAPBindPassword redacted
>>>>>      Require ldap-group redacted
>>>>>      DAV svn
>>>>>      SVNPath /path/to/test_repos
>>>>>      SVNPathAuthz off
>>>>>   </Location>
>>>>> ]]]
>>>>> 
>>>>> Any ideas?
>>>>> Why the cpu usage by the server, what's it doing?
>>>>> What is the dav/activities.d directory for? How come it contains so
>>>>> many files? Is it ok to purge the old files from that directory?
>>>> 
>>>> Httpd's mod_dav was updated in some recent version to do a full lock
>>>> traversal on copies and moves. I think we already applied some
>>>> optimizations, but the real fix would be that mod_dav shouldn't do this
>>>> work (which our repos layer already does).
>>>> 
>>>> I'm not sure in which release we applied the first set of optimizations.
>>> 
>>> Thanks for refreshing my memory.
>>> 
>>> So the problem is known as issue #4531 (server-side copy (over dav)
>>> uses too much memory) [1]. The memory usage issue has been fixed in
>>> SVN 1.8.11 and 1.7.19 (see CHANGES), but a performance problem remains
>>> (copy is no longer O(1), but depends on the size of the tree being
>>> copied). That's a direct violation of one of Subversion's "old selling
>>> points" vs. CVS: that branching / tagging is O(1). Branching / tagging
>>> taking several minutes brings back "fond memories" from CVS' days.
>>> 
>>> As Philip pointed out in his last comment on #4531 [2]: "This issue is
>>> related to a change in mod_dav in 2.2.25 to fix PR54610 which
>>> added a walk over the copy source looking for lock tokens." (also
>>> released in 2.4.5; so both httpd 2.2.25+ and 2.4.5+ are affected --
>>> older httpd's won't have this problem I guess).
>>> 
>>> Again quoting Philip: "Apache knows in advance that the walk is
>>> redundant in cases such as Subversion's URL-to-URL copy but Subversion
>>> cannot avoid the read access. We should attempt to fix mod_dav to
>>> avoid the walk where possible."
>>> 
>>> So my hope rests with Philip and others who might have the necessary
>>> knowledge to fix this in mod_dav. It's really not acceptable that
>>> branching / tagging (or I'm guessing also: moving a large tree with a
>>> server-side move) takes several minutes.
>>> 
>>> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4531
>>> [2] http://subversion.tigris.org/issues/show_bug.cgi?id=4531#desc12
>> 
>> I think I've found a workaround: it seems the tree walk by mod_dav is
>> avoided when the request has a header Depth with value 0. I've tried
>> adding
>> 
>>    <If "%{REQUEST_METHOD} == 'COPY'">
>>        RequestHeader set Depth 0
>>    </If>
>> 
>> to the Location block of SVN, and the copy is fast again! And the good
>> thing is: it's still a fully recursive copy :-) (otherwise it wouldn't
>> be much of a workaround).
>> 
>> 'svn copy' time for a very large tree (artificially generated with
>> ~50000 folders and ~250000 files) is now down to 1.5 seconds (still
>> three times slower than the same via file:/// or svn://, but good
>> enough, and not O(sizeof(tree)) anymore).
>> 
>> Is this workaround safe? Thoughts?
>> It might even be something that can be exploited by our client, when
>> 'svn copy'ing ... (though a "normal" server-side fix for this problem,
>> within the normal workings of mod_dav, would of course be better
>> still).
> 
> Seems this workaround is pretty OK for now (apparently the subversion
> code on the server ignores the Depth:0 for COPY requests, so the copy
> is handled like a normal recursive copy).
> 
> Bert suggested on irc to make the setting of the header also dependent
> on the useragent string.
> 
> For completeness: I'm now no longer seeing the 1.5 seconds time for
> copying over dav. Today it's more like 0.5 - 0.7 seconds, i.e. the
> same as with file:// and svn://. Maybe something was slowing down my
> network temporarily yesterday evening.
> 
> -- 
> Johan

Are we going to change the client to send this header? This seems like
too significant a regression in one of our primary "promises" to let it
wait for a mod_dav fix that might never even happen.

Mark
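
The workaround quoted above, combined with Bert's suggestion to also
check the user agent, could be sketched roughly as follows. This is an
untested sketch, not a confirmed configuration. It assumes httpd 2.4 or
later (where <If> expressions are available) with mod_headers loaded,
and it assumes that Subversion clients send a User-Agent that starts
with "SVN/". The /test_svn location is taken from the quoted config.

```apache
<Location /test_svn>
    # Existing DAV svn, SVNPath and auth directives stay as they are.

    # Assumption: Subversion clients identify themselves with a
    # User-Agent beginning with "SVN/". Only COPY requests from such
    # clients get the Depth header overridden.
    <If "%{REQUEST_METHOD} == 'COPY' && %{HTTP_USER_AGENT} =~ m#^SVN/#">
        RequestHeader set Depth 0
    </If>
</Location>
```

With the extra user-agent condition, COPY requests from other WebDAV
clients would keep whatever Depth header they sent, so only Subversion
copies would skip the mod_dav tree walk.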