directory-dev mailing list archives

From Alex Karasulu
Subject Re: Add perf issues
Date Mon, 10 May 2010 08:03:16 GMT
On Sun, May 9, 2010 at 11:51 PM, Alex Karasulu wrote:

> Hi Emmanuel,
> On Sun, May 9, 2010 at 5:40 PM, Emmanuel Lecharny wrote:
>> Hi,
>> while doing some perf tests on the Add operation, I'm getting blocked after
>> having added around 3800 entries. After investigation, I found that some
>> indexes are very expensive to update:
>> - the ObjectClass index quickly gets saturated, and adding a new entry into
>> it costs a hell of a lot of time.
>> - same problem with the OneLevel index
>> - same problem with the SubLevel index
>> - same problem with the RDN index
>> - from time to time, the UUID index takes 300 ms to sync, but that's
>> random and it may be due to a page split
>> - all the other indexes behave perfectly well, presumably because they are
>> not impacted, as we don't add any entry into them.
>> Ok, now, a blind guess is that those indexes, except the RDN index, will
>> all contain a reference to the newly created entry, assuming I'm adding N
>> entries with a DN: cn=test<N>,ou=system. They will all be at the same level
>> (thus the problem with the OneLevel and SubLevel indexes), and all have the
>> same OC (top and person), thus the problem with the OC index. I'm a bit more
>> surprised by the RDN index problem, unless we consider that the ou=system
>> RDN points to all its children, in which case it makes sense that we have
>> the same problem.
>> Basically, it seems that all the references are added to the same page,
>> which is deserialized and serialized for each addition, growing and growing.
>> Do we have a size limit for a page after which it is moved to a sub-tree?
> No, there is no size limit once a sub-tree is used for a key's values. Over
> the limit of 512 values (I think), the in-memory key values are switched
> over to the b+tree.
>> What's the strategy when we will have millions of entries under the
>> 'person' ObjectClass, or millions of entries in a flat directory?
> Logic was added to the index implementation to switch data structures for a
> key having more than a certain threshold of values. If I remember correctly
> the default for the threshold was 512 values for the same key. So if we have
> an ou index and the 'Engineering' key has >512 values, the data structure
> switches from a Collection object to a BTree which only stores values of
> this key.
>> I thought this issue was fixed a while ago, and I'm a bit surprised that it
>> is still present in the server, making it totally unusable.
> Yeah, this should have been solved a long time ago, you are right. It might
> have failed, and the problem may have crept back into the picture.
> Let's just make sure the data structure switch is in fact taking place after
> passing the threshold.
>> Or am I missing a configuration parameter (like the max number of elements
>> in a page before a sub-tree is created)?
> Don't think so. It defaults to some threshold value. Perhaps the threshold
> needs to be dropped too but it should not be this bad.
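
As a rough illustration of why the adds get slower and slower, here is a sketch only, under the assumption from Emmanuel's guess that every add rewrites the whole value collection of the shared key. `RewriteCost` and `valuesWritten` are made-up names for this example, not ApacheDS code.

```java
// Hypothetical helper, not part of ApacheDS: counts how many values get
// serialized in total if each add rewrites the full value list of one key.
final class RewriteCost {
    static long valuesWritten(int entries) {
        long total = 0;
        for (int size = 1; size <= entries; size++) {
            // the add that brings the key to `size` values rewrites all of them
            total += size;
        }
        return total;
    }
}
```

Under that assumption the total grows as N(N+1)/2, so `RewriteCost.valuesWritten(3800)` is 7,221,900 values written for 3800 adds, for each affected index.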

I'm starting to look into the switch-over to secondary BTrees for
duplicates once the threshold is reached, to make sure it is working
properly. I remember having a hard time writing a test case for this
before, because the switch-over is an implementation detail internal to the
index implementation and is not noticeable to the outside world (callers).

I guess I can expose a flag on the implementation class that reports which
mode the Index is in, so a test can check that a valid BTree is being used.
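
Something along these lines, as a minimal sketch and not the actual index code: `DupsContainer`, `isTreeMode` and the constructor argument are hypothetical names, a `TreeSet` stands in for the secondary BTree, and an `ArrayList` stands in for the in-memory Collection.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical holder for the values stored under one index key.
final class DupsContainer<V extends Comparable<V>> {
    private final int threshold;                      // e.g. 512 values per key
    private Collection<V> values = new ArrayList<>(); // small in-memory collection
    private boolean treeMode = false;                 // the flag exposed to tests

    DupsContainer(int threshold) {
        this.threshold = threshold;
    }

    boolean add(V value) {
        if (values.contains(value)) {
            return false; // a key never holds the same value twice
        }
        if (!treeMode && values.size() >= threshold) {
            // This add would push the key past the threshold: switch structures.
            values = new TreeSet<>(values); // stand-in for the secondary BTree
            treeMode = true;
        }
        return values.add(value);
    }

    boolean isTreeMode() {
        return treeMode;
    }

    int size() {
        return values.size();
    }
}
```

A test could then add 512 distinct values and check that `isTreeMode()` is still false, add one more, and check that it has flipped to true without the caller-visible contents changing.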

Alex Karasulu