directory-dev mailing list archives

From Emmanuel Lecharny <>
Subject Re: Various questions
Date Tue, 06 Jun 2006 20:35:28 GMT
Quanah Gibson-Mount wrote:

> Well, I see ldap servers expose data to the world all the time.  
> Pretty much any university I send random queries to does so.  

That is the university ecosystem. In the business ecosystem, trust me,
they don't!!! (Of course they do, simply because security is something
strange and not very well understood, because it costs money, and because
cost killers are really efficient at killing efficient administrators... :)

> @ Stanford, we allow users to affect the "visibility" of their data, 
> with 3 settings:
> "world" -- Available to anyone, including anonymous
> "stanford" -- Available only to those people who have authenticated as 
> being from Stanford
> "private" -- Not visible to anyone by normal means (specific 
> applications get by this)

Seems fair.
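
To make the three settings concrete, here is a minimal sketch in Python
of how such a visibility check could look. It is not Stanford's actual
implementation: the names Visibility and can_read, and the
privileged_app flag standing in for the "specific applications" that
bypass the private setting, are all assumptions made for illustration.

```python
from enum import Enum


class Visibility(Enum):
    """The three per-entry settings described in the quoted message."""
    WORLD = "world"        # anyone, including anonymous binds
    STANFORD = "stanford"  # only authenticated Stanford users
    PRIVATE = "private"    # hidden from normal lookups


def can_read(visibility, authenticated, privileged_app=False):
    """Return True if the requester may see an entry.

    privileged_app is a hypothetical flag for the specific
    applications that are allowed to see private entries.
    """
    if privileged_app:
        return True
    if visibility is Visibility.WORLD:
        return True
    if visibility is Visibility.STANFORD:
        return authenticated
    return False  # PRIVATE, and anything unexpected, stays hidden
```

An anonymous client would then only ever see "world" entries, while an
authenticated one would see "world" and "stanford" entries.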

> Since there is a fair amount of data then available to anyone who 
> wants to run a query because of policy, I do try my best to do due 
> diligence and cut down on spam harvesting runs.  We do have a result 
> limit on the server, but the people I've run across are savvy enough 
> to use batched queries of different ranges to effectively get around 
> that, at least in part.

This is also a reason why companies don't want to expose data: to avoid
those kinds of bastards trying to overload the server.
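
As an illustration of why a result limit alone does not cap what a
harvester ends up with, here is a small in-memory simulation in Python.
Nothing in it talks to a real LDAP server: DIRECTORY, SIZE_LIMIT,
search and harvest are hypothetical names, and the prefix ranges are
just one assumed way of batching queries.

```python
import string

SIZE_LIMIT = 3  # hypothetical per-query result limit on the server

# Toy directory standing in for the real entries.
DIRECTORY = ["alice", "adam", "bob", "barbara", "carol", "chris", "dave"]


def search(prefix, size_limit=SIZE_LIMIT):
    """Return at most size_limit names that start with prefix."""
    matches = sorted(name for name in DIRECTORY if name.startswith(prefix))
    return matches[:size_limit]


def harvest():
    """Issue one narrow query per initial letter and merge the results."""
    found = set()
    for letter in string.ascii_lowercase:
        found.update(search(letter))
    return found
```

A single broad query, search(""), is cut off at the limit, but
harvest() collects every entry because each narrow range stays under
it. This is why a limit is usually combined with other measures such as
authentication or rate limiting.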

> People also like to be able to use their email clients to get 
> information from the directory servers, and very few of them (only one 
> that I've found) support SASL/GSSAPI binds, which is the only 
> authentication method we allow (no username/password).

Yeah, that's true. The other option is SSL tunneling. Obviously a good
solution when the client does not support SASL/GSSAPI.

>> Well, in production, loading a server is not something you do very
>> often. You may need to restore a crashed database, or reload a database
>> whose structure has changed, but this is definitely not a real concern.
>> Load once, use many.
> I think that's a good thought in theory, and is what I thought too. 
> However, I run 4 environments (dev, test, uat, and production).  

I used to work for a client with 5 environments: add a pre-prod. Only
two of those environments have a fully loaded base, prod and pre-prod.
It does not change every single day. Dev, test and uat usually have a
subset of the prod base.

> We have a custom schema that we modify a few times a year, and those 
> modifications are usually large enough to warrant a complete reload of 
> the data that is generated from our RDBMS for the ldap servers.  

Funny! Schemas are not supposed to change, but they *do*. They just
do... I worked for a company that changed its schema every month: "Oh, we
forgot that attribute... Could you add this class? ... Btw, the
previous attribute is useless...". In those cases, a metadirectory comes
to the rescue. Not a good solution, anyway. However, for a 1M-entry LDAP
server, schema changes should not occur that often, and in this case
you have plenty of time to do the migration, as you go through 4 envs
before going to production. My client with 70M entries uses pre-prod for
that. When the server is loaded, they swap pre-prod and prod. Et voilà!

> As a part of that process, dev may be reloaded several times as bugs 
> are fixed, etc, and the same goes for test.  So I actually reload my 
> servers a bit. ;)

Should not be a burden in dev, if you use a subset of the database.

>> 3 GB is really nothing. A 15K RPM SCSI disk is now 36 GB minimum and
>> costs around $200. Not a big deal. Better to spend money on memory
>> sticks rather than on high-performance disks :)
> Yeah, my concerns here may be more specific to OpenLDAP and the use of 
> BDB. When bulk loading, it is quickest to have enough BDB cache as the 
> entire size of your database (3.8GB in the case above).  On Solaris 
> SPARC, I found that the only good way to get performance was to use a 
> shared memory region (Linux doesn't require that), which means that I 
> have to have as much memory available as BDB cache on the system, and 
> memory is sadly not so cheap as disk.

Fair enough. I incline towards the opinion that the best LDAP
database is the one that is totally cached in memory. It's not
always possible (again, for my 70M client...). But I will say that
99.99% of all LDAP servers in the world are using fewer than 100
entries :). However, we will see an extension of LDAP server usage in
the next few years, and I won't be surprised if we have servers where
the number of entries is two or three orders of magnitude higher (think
about WebServices, RFID, etc).

Man, this is an interesting discussion! Could last for hours... Worth a
couple of beers :)


> --Quanah
> -- 
> Quanah Gibson-Mount
> Principal Software Developer
> ITS/Shared Application Services
> Stanford University
> GnuPG Public Key:
