lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless (JIRA)" <>
Subject [jira] Updated: (LUCENE-1594) Use source code specialization to maximize search performance
Date Thu, 07 May 2009 19:00:46 GMT


Michael McCandless updated LUCENE-1594:

    Attachment: LUCENE-1594.patch

New patch attached:

  * Specialize for the no norms cases

  * N-clause BooleanQuery of TermQuerys now handled

  * Handle "setMinimumNumberShouldMatch"

  * MUST_NOT clauses handled

  * Allow total hits to NOT be computed, and then when sorting by
    field, do a fail fast on a doc while iterating the TermDocs if the
    doc can't compete in the current PQ (discussed under LUCENE-1593)

  * Pre-replace nulls with U+0000 in StringIndex

  * Other random optimizations

Patch is small because I'm not including all generated sources (there
are too many).

This patch always pre-fills the queue w/ sentinel values.

These optimizations result is very sizable performance gains,
especially with OR queries that sort by field, do not require total
hit count (with or without filtering, deletions, scoring, etc.).  In
these cases the specialized code runs 2.5-3.5X faster than Lucene

> Use source code specialization to maximize search performance
> -------------------------------------------------------------
>                 Key: LUCENE-1594
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>         Attachments:, LUCENE-1594.patch, LUCENE-1594.patch, LUCENE-1594.patch
> Towards eeking absolute best search performance, and after seeing the
> Java ghosts in LUCENE-1575, I decided to build a simple prototype
> source code specializer for Lucene's searches.
> The idea is to write dynamic Java code, specialized to run a very
> specific query context (eg TermQuery, collecting top N by field, no
> filter, no deletions), compile that Java code, and run it.
> Here're the performance gains when compared to trunk:
> ||Query||Sort||Filt|Deletes||Scoring||Hits||QPS (base)||QPS (new)||%||
> |1|Date (long)|no|no|Track,Max|2561886|6.8|10.6|{color:green}55.9%{color}|
> |1|Date (long)|no|5%|Track,Max|2433472|6.3|10.5|{color:green}66.7%{color}|
> |1|Date (long)|25%|no|Track,Max|640022|5.2|9.9|{color:green}90.4%{color}|
> |1|Date (long)|25%|5%|Track,Max|607949|5.3|10.3|{color:green}94.3%{color}|
> |1|Date (long)|10%|no|Track,Max|256300|6.7|12.3|{color:green}83.6%{color}|
> |1|Date (long)|10%|5%|Track,Max|243317|6.6|12.6|{color:green}90.9%{color}|
> |1|Relevance|no|no|Track,Max|2561886|11.2|17.3|{color:green}54.5%{color}|
> |1|Relevance|no|5%|Track,Max|2433472|10.1|15.7|{color:green}55.4%{color}|
> |1|Relevance|25%|no|Track,Max|640022|6.1|14.1|{color:green}131.1%{color}|
> |1|Relevance|25%|5%|Track,Max|607949|6.2|14.4|{color:green}132.3%{color}|
> |1|Relevance|10%|no|Track,Max|256300|7.7|15.6|{color:green}102.6%{color}|
> |1|Relevance|10%|5%|Track,Max|243317|7.6|15.9|{color:green}109.2%{color}|
> |1|Title (string)|no|no|Track,Max|2561886|7.8|12.5|{color:green}60.3%{color}|
> |1|Title (string)|no|5%|Track,Max|2433472|7.5|11.1|{color:green}48.0%{color}|
> |1|Title (string)|25%|no|Track,Max|640022|5.7|11.2|{color:green}96.5%{color}|
> |1|Title (string)|25%|5%|Track,Max|607949|5.5|11.3|{color:green}105.5%{color}|
> |1|Title (string)|10%|no|Track,Max|256300|7.0|12.7|{color:green}81.4%{color}|
> |1|Title (string)|10%|5%|Track,Max|243317|6.7|13.2|{color:green}97.0%{color}|
> Those tests were run on a 19M doc wikipedia index (splitting each
> Wikipedia doc @ ~1024 chars), on Linux, Java 1.6.0_10
> But: it only works with TermQuery for now; it's just a start.
> It should be easy for others to run this test:
>   * apply patch
>   * cd contrib/benchmark
>   * run python -u -delindex </path/to/index/with/deletes>
>     -nodelindex </path/to/index/without/deletes>
> (You can leave off one of -delindex or -nodelindex and it'll skip
> those tests).
> For each test, generates a single Java source file that runs
> that one query; you can open
> contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/
> to see it.  I'll attach an example.  It writes "results.txt", in Jira
> table format, which you should be able to copy/paste back here.
> The specializer uses pretty much every search speedup I can think of
> -- the ones from LUCENE-1575 (to score or not, to maxScore or not),
> the ones suggested in the spinoff LUCENE-1593 (pre-fill w/ sentinels,
> don't use docID for tie breaking), LUCENE-1536 (random access
> filters).  It bypasses TermDocs and interacts directly with the
> IndexInput, and with BitVector for deletions.  It directly folds in
> the collector, if possible.  A filter if used must be random access,
> and is assumed to pre-multiply-in the deleted docs.
> Current status:
>   * I only handle TermQuery.  I'd like to add others over time...
>   * It can collect by score, or single field (with the 3 scoring
>     options in LUCENE-1575).  It can't do reverse field sort nor
>     multi-field sort now.
>   * The auto-gen code ( is rather hideous.  It could use some
>     serious refactoring, etc.; I think we could get it to the point
>     where each Query can gen its own specialized code, maybe.  It also
>     needs to be eventually ported to Java.
>   * The script runs old, then new, then checks that the topN results
>     are identical, and aborts if not.  So I'm pretty sure the
>     specialized code is working correctly, for the cases I'm testing.
>   * The patch includes a few small changes to core, mostly to open up
>     package protected APIs so I can access stuff
> I think this is an interesting effort for several reasons:
>   * It gives us a best-case upper bound performance we can expect from
>     Lucene's normal search classes (minus algorithmic improvements eg
>     PFOR) because it makes life as easy as possible on the
>     compiler/JRE to convert to assembly.
>   * We can spin out optimization ideas from this back into the core
>     (eg LUCENE-1593 already has one example), and prioritize.  EG I
>     think given these results, optimizing for filters that support
>     random-access API is important.  As we fold speedups back into
>     core, the gains from specialization will naturally decrease.
>   * Eventually (maybe, eg as a future "experimental" module) this can
>     be used in production as a simple "search wrapper".  Ie, for a
>     given query, the specializer is checked.  If the query "matches"
>     what the specializer can handle, then the specialized code is run;
>     else we fallback to Lucene core.  Likely one would pre-compile the
>     space of all specializations, or we could compile java-on-the-fly
>     (eg what a JSP source does when it's changed) but I'm not sure how
>     costly/portable that is.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message