lucene-general mailing list archives

From Michael McCandless
Subject Re: Benchmarking Lucene
Date Mon, 23 Nov 2015 20:04:39 GMT
Which JVM vendor? :)  There are not so many, unfortunately...

I run nightly benchmarks for Lucene, which are visible at

We use this to catch accidental performance regressions... the sources
for all of this are at but
running them yourself can be tricky.  They index and search
Wikipedia's English export.

Lucene is definitely JVM/GC bound in many cases, e.g. when the index
is "hot" (fully cached by the OS in free RAM).

I'm not familiar with Dacapo...

I'm not sure how aggressively users upgrade ... but I believe most
users use Lucene via Elasticsearch or Solr.

Mike McCandless

On Mon, Nov 23, 2015 at 2:42 PM, Sanjoy Das wrote:
> Hi all,
> I work for a JVM vendor, and we're interested in obtaining / creating
> a set of Lucene benchmarks for internal use.  We plan to use these for
> performance regression testing and general performance analysis
> (i.e. to make sure Lucene performs well on our JVM).  I'm especially
> interested in benchmarks that demonstrate opportunities for
> improvements in our JIT compiler.
> While I imagine that the lucene/benchmark/ directory is probably the
> right place to start, I have a few high-level questions that are best
> answered by people on this mailing list:
> - Are there realistic Lucene workloads that are bottlenecked on the
>   JVM's performance (JIT, GC etc.) and *not* e.g. disk / network IO?
>   If so, what are some examples?
> - How relevant are the Dacapo "luindex" and "lusearch" benchmarks
>   today?  Will porting them to the latest version of Lucene give me a
>   benchmark representative of modern Lucene usage, or have Lucene's
>   performance characteristics evolved in fundamental ways since Dacapo
>   was published?
> - What is the distribution of Lucene versions in production
>   deployments?  Do users tend to aggressively upgrade to the "latest
>   and greatest" Lucene version, or is there usually a non-trivial lag?
> Any other information that you think is useful or relevant is
> welcome.
> Thanks!
> -- Sanjoy
