lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doug Cutting <cutt...@apache.org>
Subject Re: ParallellMultiSearcher Vs. One big Index
Date Tue, 18 Jan 2005 19:32:09 GMT
Ryan Aslett wrote:
> What I found was that for queries with one term (First Name), the large
> index beat the multiple indexes hands down (280 Queries/per second vs
> 170 Q/s).
> But for queries with multiple terms (Address), the multiple indexes beat
> out the Large index. (26 Q/s vs 16 Q/s)
> Btw, Im running these on a 2 proc box with 16GB of ram.
> 
> So what Im trying to determine Is if there is some equations out there
> that can help me find the sweet spot for splitting my indexes.

What appears to be the bottleneck, CPU or i/o?  Is your test system 
multi-threaded?  I.e., is it attempting to execute many queries in 
parallel?  If you're CPU-bound then a single index should be fastest. 
Are you using compound format?  If you're i/o-bound, the non-compound 
format may be somewhat faster, as it permits more parallel i/o.  Is the 
index data on multiple drives?  If you're i/o bound then it should be 
faster to use multiple drives.  To permit even more parallel i/o over 
multiple drives you might consider using a pool of IndexReaders.  That 
way, with, e.g., striped data, each could be simultaneously reading 
different portions of the same file.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message