httpd-users mailing list archives

From André Warnier
Subject Re: [users@httpd] Serving partial data of in-memory common data set
Date Tue, 28 Jul 2009 20:51:57 GMT
Jonathan Zuckerman wrote:
> On Tue, Jul 28, 2009 at 3:37 AM, S.A.<> wrote:
Concurring with Jonathan about the free advice and the tenuous relevance 
to the main list topic, I'd nevertheless want to try to contribute.

My summary of the issue :
- there are N clients accessing the site
- each client is authenticated, with a client-id of some kind
- they all originally request the same URL
- the server, however, returns a page that can be different for each 
client, based on a server-side client profile selected as per the 
client-id
- the returned page is different because it includes, for each client, a 
different mixture of "items" in the page, based on the client profile
- each client gets a different selection of i items, but these i items 
are picked among a grand total of I items, which are themselves always 
the same
- you would like to cache at least part of these I items in memory, to 
speed up the responses to the clients
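
That selection logic might be sketched as follows. This is a minimal
illustration only; the item store, the profile table and all the names
(ITEMS, PROFILES, select_items) are assumptions, not your actual
application:

```python
# Hypothetical sketch: every client requests the same URL, but each one
# gets a different subset of the same grand total of I items, chosen
# according to its server-side profile.

ITEMS = {f"item-{n}": f"content of item {n}" for n in range(100)}  # the I items

PROFILES = {
    # client-id -> the i items this client's page should include
    "client-a": ["item-1", "item-7", "item-42"],
    "client-b": ["item-7", "item-99"],
}

def select_items(client_id):
    """Return this client's i items, picked from the common pool."""
    wanted = PROFILES.get(client_id, [])
    return {name: ITEMS[name] for name in wanted if name in ITEMS}
```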

You haven't given us any hard numbers, like how many clients there are, 
how concurrently they access the server, how many I items there really 
are, how large each I item is, how fast the server is, how much memory 
it has, or anything of the kind.
You have mentioned that some of the I items were "media", which I 
personally tend to associate with "large", byte-wise.

My very first reaction would be to ask myself if it is all really worth 
it.  Caching in memory, no matter how it's done, has a cost.  A cost in 
design, complexity, and in pure cache management.
Modern operating systems already cache disk data.  So if the same 
"object" is accessed frequently in a short period of time, it will in 
practice already be cached in memory buffers by the OS.  Below the OS 
level, good disk controllers also cache frequently accessed data.  Below 
the controllers, the disks themselves cache data in their own cache 
memory.  Caching it yet again, with a different piece of software, may 
just add overhead.

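To illustrate the management cost mentioned above, here is a minimal
sketch of the bookkeeping that even a trivial size-bounded in-memory
cache has to do (size accounting, eviction order).  It is an
illustration of the complexity involved, not a recommendation:

```python
from collections import OrderedDict

class TinyCache:
    """Minimal size-bounded LRU cache, just to show the bookkeeping
    (eviction order, size accounting) any in-memory cache must do."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.entries = OrderedDict()  # key -> bytes, oldest first

    def put(self, key, data):
        if key in self.entries:
            self.used -= len(self.entries.pop(key))
        self.entries[key] = data
        self.used += len(data)
        while self.used > self.max_bytes:       # evict least recently used
            _, old = self.entries.popitem(last=False)
            self.used -= len(old)

    def get(self, key):
        data = self.entries.get(key)
        if data is not None:
            self.entries.move_to_end(key)       # mark as recently used
        return data
```

And this still ignores expiry, invalidation when an item changes on
disk, and locking between concurrent server processes.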
An additional aspect is that, if some of the objects are large, and your 
server has limited memory, caching many such objects may fill up the 
physical memory, and cause the system to start swapping, which would 
really have the opposite effect to what you're looking for.

On the other hand, accessing an object on disk requires quite a bit of 
work on Apache's part; all the more work the deeper the object resides 
in the "document space", because Apache needs to "walk" the directory 
hierarchy, checking access and other rules at each level.  So by 
organising your objects smartly on disk, so as to minimise the work 
Apache has to do to find and return them, you may gain a whole lot of 
processing time.
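
For example, a configuration along these lines (a sketch only; the path 
and directory layout are made up) keeps that per-request walk cheap by 
serving the items from a shallow directory and disabling the .htaccess 
lookups and symlink ownership checks Apache would otherwise perform at 
every level:

```apache
# Hypothetical example for Apache 2.2: shallow document tree,
# no per-request .htaccess lookups, no per-path symlink checks.
DocumentRoot "/srv/items"
<Directory "/srv/items">
    Options FollowSymLinks      # skip the lstat() ownership checks
    AllowOverride None          # no .htaccess walk on every request
    Order allow,deny
    Allow from all
</Directory>
```
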

And servers nowadays are cheap. For the time and money you'd spend 
studying the best caching scheme, you could easily buy an extra server 
with terabytes of disk space and gigabytes of ram to use as I/O cache.

So basically what I am saying, is : try it, without any clever caching 
scheme, but with a clever organisation of your data and an efficient 
Apache configuration.  That /may/ show a problem and a bottleneck, which 
you can then tackle on its own merits.  On the other hand, it may show 
no problem at all.

A lot of work has gone into Apache, to make it as efficient as possible 
to serve content of all kinds.  There are thousands of Apache sites 
handling thousands of clients, and a lot of content.
Do not spend a lot of time ahead of time, to solve what is maybe a 
non-existent problem.  As someone said a long time ago : premature 
optimisation is the root of all evil.

The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:> for more info.
