lucene-java-user mailing list archives

From Otis Gospodnetic <>
Subject Re: CachingWrapperFilter: why cache per IndexReader?
Date Sun, 13 Jan 2008 02:47:32 GMT
I didn't follow this thread, but let me point out that certain critical pieces of code executed
during search are synchronized (FileIndexInput or some such).  Thus, with a pile of search
threads sharing the same IndexSearcher, this can cause a bottleneck.  In such cases I think
having multiple IndexSearchers will actually be beneficial, as it will reduce that contention.
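
To make the contention argument concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: FakeSearcher is a hypothetical stand-in whose search() method is synchronized, standing in for whatever synchronized read sits under a real searcher. The thread and query counts are arbitrary assumptions, and the timings it prints say nothing about a real index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SearcherContentionSketch {

    /** Hypothetical stand-in for a searcher whose low-level read is synchronized. */
    static final class FakeSearcher {
        private long reads = 0;

        // Only one thread at a time may enter, as with a synchronized file read.
        synchronized void search() {
            reads++;
        }

        synchronized long reads() {
            return reads;
        }
    }

    /**
     * Runs `threads` workers, each issuing `queriesPerThread` searches, either
     * against one shared FakeSearcher or against one FakeSearcher per thread.
     * Returns the total number of searches performed.
     */
    static long run(int threads, int queriesPerThread, boolean shared) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        FakeSearcher sharedSearcher = new FakeSearcher();
        List<FakeSearcher> searchers = new ArrayList<>();
        List<Future<?>> futures = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            FakeSearcher s = shared ? sharedSearcher : new FakeSearcher();
            searchers.add(s);
            futures.add(pool.submit(() -> {
                for (int q = 0; q < queriesPerThread; q++) {
                    s.search();
                }
            }));
        }
        for (Future<?> f : futures) {
            f.get(); // wait for every worker to finish
        }
        pool.shutdown();
        if (shared) {
            return sharedSearcher.reads();
        }
        long total = 0;
        for (FakeSearcher s : searchers) {
            total += s.reads();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        for (boolean shared : new boolean[] {true, false}) {
            long start = System.nanoTime();
            long total = run(4, 100_000, shared);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println((shared ? "shared" : "per-thread") + ": "
                    + total + " searches in " + ms + " ms");
        }
    }
}
```

In the shared case all four threads queue on one monitor; in the per-thread case each lock is uncontended. Whether that difference matters next to the cost of opening several readers on a real index is exactly what has to be measured.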

On 1 thread vs. 2 vs. 4, I don't have the explanation.

Sematext -- Lucene - Solr - Nutch

----- Original Message ----
From: Toke Eskildsen <>
Sent: Friday, January 11, 2008 5:34:48 AM
Subject: Re: CachingWrapperFilter: why cache per IndexReader?

On Tue, 2008-01-01 at 15:06 -0500, Mark Miller wrote:
> Perhaps, in some esoteric case, multiple readers is the right idea 
> (monster, monster, super IO system, static index?? maybe...)...but 
> unless you have run into this case and have some data to show it, I 
> would stick with what the community currently designates as best 
> practice. Otherwise, you're really just getting ahead of yourself.
> searcher has been scaled to some pretty huge sites. More than one 
> searcher is usually responsible for very slow performance.

I took the opportunity to make a small test. Our index is 37 GB with 10
million records. I tested on a dual-processor Xeon machine with 8 GB of
RAM. The test-material was logged queries from our production system.
I tried with a shared searcher between threads and a searcher for each thread.

As for shared searcher vs. individual searchers, there was just a
penalty for using individual searchers. However, all our queries are
standard ranked searches - no sorting on title or similar, so I see no
reason to disbelieve that multiple searchers often result in poor performance.

As for threading, I noticed something strange: On the dual-core machine,
two threads gave better performance than one, while 4 threads gave the
same performance as one. I tried it with standard hard disks (two 15,000
RPM in RAID 1) and with Flash drives (two 32GB Samsung in RAID 0) - in
both cases the picture was the same. I would have expected the
performance of four threads to be on par with two threads, or at least
to lie somewhere between one and two threads?
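
A measurement like this can be set up roughly as follows. This is only a sketch under stated assumptions, not the harness used for the numbers above: the replay method, the placeholder query list and the Consumer standing in for the real search call are all hypothetical, and the actual searcher would be plugged in where the placeholder is.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class ThreadScalingSketch {

    /**
     * Replays the logged queries using the given number of threads. Each thread
     * takes the next unprocessed query until the log is exhausted.
     * Returns the number of queries that were executed.
     */
    static int replay(List<String> queries, int threads, Consumer<String> search)
            throws InterruptedException {
        AtomicInteger next = new AtomicInteger(0);
        AtomicInteger done = new AtomicInteger(0);
        CountDownLatch finished = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                try {
                    int i;
                    while ((i = next.getAndIncrement()) < queries.size()) {
                        search.accept(queries.get(i));
                        done.incrementAndGet();
                    }
                } finally {
                    finished.countDown();
                }
            }).start();
        }
        finished.await(); // block until all worker threads are done
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder query log; a real test would load logged production queries.
        List<String> queries = Collections.nCopies(10_000, "hypothetical query");
        // Placeholder for the real search call (assumption, not a Lucene API).
        Consumer<String> search = q -> q.hashCode();

        for (int threads : new int[] {1, 2, 4}) {
            long start = System.nanoTime();
            int executed = replay(queries, threads, search);
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("%d thread(s): %d queries, %.0f queries/sec%n",
                    threads, executed, executed / seconds);
        }
    }
}
```

Running the same query log at 1, 2 and 4 threads, ideally several times and after a warm-up pass, gives directly comparable throughput figures for each thread count.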

There are some graphs for my measurements at
should anyone be interested.

To unsubscribe, e-mail:
For additional commands, e-mail:
