lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Updated: (LUCENE-2672) speed up automaton seeking in nextString
Date Mon, 27 Sep 2010 09:26:32 GMT


Robert Muir updated LUCENE-2672:

    Attachment: LUCENE-2672.patch

OK, here's a committable patch.

I put a safety check in here to address my own concerns, so the optimization doesn't apply to infinite automata
(but these typically don't backtrack anyway).

I found a small performance problem with Standard's terms dict cache: we should avoid clone() on
these deep hierarchies if there's a chance it will get called a lot. Since the class in question
is private static, I changed how clone() was implemented.
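
For readers without the patch at hand, here is a minimal sketch of one way a clone() on a private static class might be rewritten to copy fields directly rather than delegating up the hierarchy through super.clone(). This is an assumption about the general shape, not the code in LUCENE-2672.patch: the class names (CloneSketch, BaseState, CachedState) and all fields are hypothetical.

```java
/** Hypothetical illustration; none of these names come from Lucene. */
final class CloneSketch {

    /** Stand-in for a base class somewhere up a deeper hierarchy. */
    static class BaseState implements Cloneable {
        long ord;
        long filePointer;
    }

    /** Private and final, so no subclass depends on what clone() returns. */
    private static final class CachedState extends BaseState {
        int docFreq;
        long freqOffset;

        /** Builds the copy by hand instead of calling super.clone(). */
        @Override
        public CachedState clone() {
            CachedState other = new CachedState();
            other.ord = ord;
            other.filePointer = filePointer;
            other.docFreq = docFreq;
            other.freqOffset = freqOffset;
            return other;
        }
    }

    /** Clones a state, mutates the original, and returns the clone's fields. */
    static long[] demo() {
        CachedState original = new CachedState();
        original.ord = 7;
        original.filePointer = 1024;
        original.docFreq = 3;
        original.freqOffset = 2048;
        CachedState copy = original.clone();
        original.docFreq = 99; // must not affect the copy
        return new long[] {copy.ord, copy.filePointer, copy.docFreq, copy.freqOffset};
    }
}
```

A hand-written copy like this is only safe because the class cannot be subclassed elsewhere. For an open class hierarchy, super.clone() is what guarantees the copy has the right runtime type.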

I also turned off the terms dict cache for automaton queries. It doesn't seem to help in any query I tested,
and for some worst-case ones it slows things down (even with the cloning fix). Queries like this
"trash" the cache anyway.

> speed up automaton seeking in nextString
> ----------------------------------------
>                 Key: LUCENE-2672
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Robert Muir
>            Priority: Minor
>             Fix For: 4.0
>         Attachments: LUCENE-2672.patch, LUCENE-2672.patch, LUCENE-2672.patch
> While testing, I found there are some queries (e.g. wildcard ?????????) that do quite a lot of backtracking.
> nextString doesn't handle this particularly well: when it walks the DFA and hits a dead-end that requires backtracking, it increments the bytes and starts over completely.
> Alternatively, it could save the path information in an int[], and backtrack() could return a position to restart from, instead of just a boolean.
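
The idea in the description above can be sketched on a toy DFA. This is an illustration under simplifying assumptions, not the patch itself: the DFA is a plain transition table, the input is a fixed-length array of small integer symbols rather than term bytes, and the names (BacktrackSketch, walk, seek, maxSymbol) are hypothetical. Only the two ingredients from the description are kept: an int[] of saved states, and a backtrack() that returns the position to resume from.

```java
import java.util.Arrays;

/** Toy illustration only; not the Lucene implementation. */
final class BacktrackSketch {
    static final int DEAD = -1; // table entry meaning "no transition"

    /**
     * Feeds input[from..] through the DFA, starting in the state saved at path[from].
     * path[i] is the state reached after consuming input[0..i).
     * Returns the first position with no transition, or input.length if none.
     */
    static int walk(int[][] dfa, int[] input, int from, int[] path) {
        int state = path[from];
        for (int pos = from; pos < input.length; pos++) {
            int next = dfa[state][input[pos]];
            if (next == DEAD) {
                return pos;
            }
            state = next;
            path[pos + 1] = state; // remember the state for a later restart
        }
        return input.length;
    }

    /**
     * Increments the rightmost symbol at or before pos that is below maxSymbol,
     * resets every later symbol to 0, and returns that position (-1 if none).
     */
    static int backtrack(int[] input, int pos, int maxSymbol) {
        for (int i = pos; i >= 0; i--) {
            if (input[i] < maxSymbol) {
                input[i]++;
                Arrays.fill(input, i + 1, input.length, 0);
                return i;
            }
        }
        return -1;
    }

    /**
     * Advances input in place to the next same-length string that the DFA can
     * consume without hitting a dead end. Does not check for an accept state.
     */
    static boolean seek(int[][] dfa, int[] input, int maxSymbol) {
        int[] path = new int[input.length + 1];
        path[0] = 0; // state 0 is assumed to be the initial state
        int from = 0;
        while (true) {
            int dead = walk(dfa, input, from, path);
            if (dead == input.length) {
                return true;
            }
            // Resume from the saved state at the changed position, not from the start.
            from = backtrack(input, dead, maxSymbol);
            if (from < 0) {
                return false;
            }
        }
    }
}
```

The restart is valid because backtrack() only changes symbols at or after the returned position, so every saved state up to that position still matches the unchanged prefix.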

