lucene-java-user mailing list archives

From Harry Foxwell
Subject lucene performance question
Date Sun, 02 Mar 2003 02:49:01 GMT
I have a project for which I want to characterize Lucene query performance
on archives of my XML files of different sizes.  I have created archives
and indices of 1000, 2000, 4000, 8000, and 16000 XML files (average
file size about 10K), generated from my DTD and containing mostly random
string content in the simple elements.  I run multiple tests, each with
different random content in the archive, timing each of three different
queries:

   query 1: Field1:stringA
   query 2: Field1:stringA Field2:stringB
   query 3: Field1:stringA AND Field2:stringB

The time to complete query 1 increases with archive size, but the
times for the subsequent query 2 and query 3 are ALL about the same
(generally less than 1 sec) regardless of archive size.  The test
machine is a Sun Ultra 60 with two 450 MHz processors and 512 MB
memory, running Solaris 9, Java 1.4, and Lucene 1.2.

I expected the time to complete query 2 and 3 to also increase
with archive size, but as I said it remained constant.  What
is Lucene doing (caching?) to make this happen?
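
One way to tell a one-time cost from a per-query cost is to time each
query several times in a row, and to vary the order in which the three
queries run.  Below is a minimal sketch of such a timing loop.  It is
not the code used for the measurements above: the class name QueryTimer,
the method timeRepeated, and the stand-in workload are all hypothetical,
and it assumes a newer Java (8 or later) than the Java 1.4 mentioned
here.  The real Lucene search call would replace the stand-in Runnable.

```java
// Hypothetical timing harness; it does not use any Lucene API.
final class QueryTimer {

    /** Runs the query the given number of times; returns each elapsed time in nanoseconds. */
    static long[] timeRepeated(Runnable query, int repetitions) {
        long[] elapsed = new long[repetitions];
        for (int i = 0; i < repetitions; i++) {
            long start = System.nanoTime();
            query.run();
            elapsed[i] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption); replace with the real search call.
        Runnable standInQuery = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                System.out.println(sum); // keeps the loop from being optimized away
            }
        };

        long[] times = timeRepeated(standInQuery, 5);
        for (int i = 0; i < times.length; i++) {
            System.out.printf("run %d: %.3f ms%n", i + 1, times[i] / 1e6);
        }
    }
}
```

If the first run of whichever query goes first is always the slow one,
and repeats of query 1 are as fast as query 2 and query 3, the extra
time is probably a start-up cost rather than something specific to
query 1.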
