directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Karasulu" <>
Subject Re: [ApacheDS] Default partition design ideas (was: Re: [ApacheDS] Going to need to implement a splay tree)
Date Sat, 02 Feb 2008 17:00:42 GMT
Howard, thanks much for this information; it really helps and is extremely
valuable to us.  We're very lucky and grateful to have you here.

More inline ...

On Feb 2, 2008 3:23 AM, Howard Chu <> wrote:

> Alex Karasulu wrote:
> > On Jan 30, 2008 8:00 AM, Emmanuel Lecharny <
> > <>> wrote:
> >     Those long must be fetched
> >     quickly, as we will always get an entry by its ID. Using a BTree for
> >     that is time consuming, as fetching an entry will be done in
> O(log(N)).
> > You're absolutely right a hash would be much better. We don't need to
> > sort the ID's.
> Way back in the OpenLDAP 2.1 days we used hashes for our indexing in
> back-bdb.
> But we found that B-Trees still performed better, even though index
> lookups
> have nearly zero locality of reference. The problem is with large DBs, the
> hash tables grow too large to fit entirely in cache. Once the table grows
> past
> a certain size, you can no longer directly reference the records; there's
> a
> lot of expensive paging in/out that needs to be done. With a B-Tree, the
> number of internal pages in the tree is still very small relative to the
> total
> number of data pages, so you get a lot of cache reuse referencing those
> pages.
> So we switched everything to use B-Trees in OpenLDAP 2.2...
> Hashing is faster *in theory*, but in practice it loses out.

I vaguely remember your mentioning this at the LDAP conference.   I just
forgot :(.  Thanks for reminding us.  We'll just stick to using a B-Tree.

>     We should use a Hash to speedup entries retrieval (of course, at the
> >     risk that the worst case scenario can transform a search operation
> to
> >     cost N read, but this is unlikely...). The only point to consider is
> >     that this ID is incremental, so the hash function must be chosen
> >     carefully.
> A good hash function is one that evenly distributes the input keys across
> the
> entire hash table. This makes hashing extremely cache-unfriendly when
> doing
> sequential traversals of a database, or sequentially bulk-loading.

Very good point about these opposing factors.  So I figure OpenLDAP just
uses the cache in the underlying B-Tree instead of managing some kind of
separate entry cache?

> >     As the data are moved frequently, even on read,
> >     this will increase the cost of searching so much that it will kill
> the
> >     main advantage of LDAP vs RDBMS. So, yes, we can use Splay trees in
> >     memory, but we can't store splay trees on disk.
> > Yeah I agree. We can use splay trees for caches or for these low
> > duplicate count records.
> You will find that anything that turns memory read operations into memory
> write operations will scale very poorly in a multiprocessor system.

I guess this is due to synchronization. A splay based cache then defeats
itself since it requires a splay operation (memory write) on every lookup.

> Yeah they're great ideas. We just need to have a solid SLAMD lab and
> > start testing these ideas out. I got the machines:
> >
> > 9 load injectos
> > 1 SLAMD Server
> > 1 beefy server for running ApacheDS
> >
> > We just need someone to step up and help us manage this environment.
> >
> > Any volunteers would be appreciated.
> Wish I could shake some more time loose right away, this is the really fun
> part. ;)

Hehe it's fun for me too.  We have soooo many low hanging fruits that can be
picked at in ApacheDS that this would be fun and rewarding.

No worries though Howard.  We'll get there and can better collaborate
together.  In the meantime I'm going to see what I can do to get us closer
to a usable environment.

Thanks again,

View raw message