lucene-dev mailing list archives

From "Mark Miller (JIRA)" <>
Subject [jira] Commented: (LUCENE-2013) QueryScorer and SpanRegexQuery are incompatible.
Date Thu, 29 Oct 2009 15:59:59 GMT


Mark Miller commented on LUCENE-2013:

Nice catch - I think I like this method better than the core modifications.

bq. but this also means that no third-party queries have any way to influence their highlighting.

Unfortunately, I think that's already the deal in many cases. The Highlighter is a very special
case - ugly, but that is the current state of things. We will hopefully get away from that eventually.

> QueryScorer and SpanRegexQuery are incompatible.
> ------------------------------------------------
>                 Key: LUCENE-2013
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/highlighter
>    Affects Versions: 2.9
>         Environment: Lucene-Java 2.9
>            Reporter: Benjamin Keil
>             Fix For: 3.0
>         Attachments: lucene-2013-2009-10-28-2135.patch, lucene-2013-2009-10-28.patch,
lucene-2013-2009-10-29-0136.patch, LUCENE-2013.patch
> Since the resolution of #LUCENE-1685, users are not supposed to rewrite their queries
before submitting them to QueryScorer:
> bq.------------------------------------------------------------------------
> bq.r800796 | markrmiller | 2009-08-04 06:56:11 -0700 (Tue, 04 Aug 2009) | 1 line
> bq.
> bq.LUCENE-1685: The position aware SpanScorer has become the default scorer for Highlighting.
The SpanScorer implementation has replaced QueryScorer and the old term highlighting QueryScorer
has been renamed to QueryTermScorer. Multi-term queries are also now expanded by default.
If you were previously rewriting the query for multi-term query highlighting, you should
no longer do that (unless you switch to using QueryTermScorer). The SpanScorer API (now QueryScorer)
has also been improved to more closely match the API of the previous QueryScorer implementation.
> bq.------------------------------------------------------------------------
> This is a great convenience for the most part, but it's causing me difficulties with
SpanRegexQuerys, as the WeightedSpanTermExtractor uses Query.extractTerms() to collect the
fields used in the query, but SpanRegexQuery does not implement this method, so highlighting
any query with a SpanRegexQuery throws an UnsupportedOperationException.  If this issue is
circumvented, there is still the issue of SpanRegexQuery throwing an exception when someone
calls its getSpans() method.
> I can provide the patch that I am currently using, but I'm not sure that my solution
is optimal.  It adds two methods to SpanQuery: extractFields(Set<String> fields) which
is equivalent to fields.add(getField()) except when MaskedFieldQuerys get involved, and mustBeRewrittenToGetSpans()
which returns true for SpanQuery, false for SpanTermQuery, and is overridden in each composite
SpanQuery to return a value depending on its components.  In this way SpanRegexQuery (and
any other custom SpanQuerys) do not need to be adjusted.
> Currently the collection of fields and non-weighted terms is done in a single step.
 In the proposed patch the WeightedSpanTerm extraction from a SpanQuery proceeds in two steps.
 First, if the QueryScorer's field is null, then the fields are collected from the SpanQuery
using the extractFields() method.  Second the terms are collected using extractTerms(), rewriting
the query for each field if mustBeRewrittenToGetSpans() returns true.
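
The two-step extraction described in the issue can be sketched roughly as follows. This is only an illustration, not the attached patch: every class here is a hypothetical stand-in written for this example (none of it is the real Lucene API), rewrite() takes a plain list of one field's terms where Lucene would take an IndexReader, and collectTerms() and demo() are invented names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-ins for the Lucene span classes; none of this is the real API.
class SpanExtractionSketch {

    abstract static class SpanQuery {
        final String field;
        SpanQuery(String field) { this.field = field; }

        // Proposed method 1: report the fields this query touches.
        void extractFields(Set<String> fields) { fields.add(field); }

        // Proposed method 2: the base class answers true, so custom queries need no changes.
        boolean mustBeRewrittenToGetSpans() { return true; }

        // Mirrors the reported failure: no term extraction unless a subclass provides it.
        void extractTerms(Set<String> terms) { throw new UnsupportedOperationException(); }

        // Assumed stand-in for Query.rewrite(IndexReader): the list holds one field's terms.
        SpanQuery rewrite(List<String> fieldTerms) { return this; }
    }

    static class SpanTermQuery extends SpanQuery {
        final String text;
        SpanTermQuery(String field, String text) { super(field); this.text = text; }
        @Override boolean mustBeRewrittenToGetSpans() { return false; }
        @Override void extractTerms(Set<String> terms) { terms.add(field + ":" + text); }
    }

    // A composite query delegates each decision to its clauses.
    static class SpanOrQuery extends SpanQuery {
        final List<SpanQuery> clauses;
        SpanOrQuery(String field, List<SpanQuery> clauses) { super(field); this.clauses = clauses; }
        @Override void extractFields(Set<String> fields) {
            for (SpanQuery c : clauses) c.extractFields(fields);
        }
        @Override boolean mustBeRewrittenToGetSpans() {
            for (SpanQuery c : clauses) if (c.mustBeRewrittenToGetSpans()) return true;
            return false;
        }
        @Override void extractTerms(Set<String> terms) {
            for (SpanQuery c : clauses) c.extractTerms(terms);
        }
        @Override SpanQuery rewrite(List<String> fieldTerms) {
            List<SpanQuery> rewritten = new ArrayList<>();
            for (SpanQuery c : clauses) rewritten.add(c.rewrite(fieldTerms));
            return new SpanOrQuery(field, rewritten);
        }
    }

    // Inherits the throwing extractTerms() and the "must be rewritten" default.
    static class SpanRegexQuery extends SpanQuery {
        final String regex;
        SpanRegexQuery(String field, String regex) { super(field); this.regex = regex; }
        @Override SpanQuery rewrite(List<String> fieldTerms) {
            List<SpanQuery> matches = new ArrayList<>();
            for (String t : fieldTerms) {
                if (t.matches(regex)) matches.add(new SpanTermQuery(field, t));
            }
            return new SpanOrQuery(field, matches);
        }
    }

    static Set<String> collectTerms(SpanQuery query, String scorerField,
                                    Map<String, List<String>> index) {
        // Step 1: collect fields only when the scorer was not given one.
        Set<String> fields = new TreeSet<>();
        if (scorerField == null) query.extractFields(fields); else fields.add(scorerField);

        // Step 2: collect terms per field, rewriting first when the query asks for it.
        Set<String> terms = new TreeSet<>();
        for (String f : fields) {
            List<String> fieldTerms = index.getOrDefault(f, Collections.<String>emptyList());
            SpanQuery q = query.mustBeRewrittenToGetSpans() ? query.rewrite(fieldTerms) : query;
            q.extractTerms(terms);
        }
        return terms;
    }

    static Set<String> demo() {
        Map<String, List<String>> index = new HashMap<>();
        index.put("body", Arrays.asList("lucene", "lucid", "solr"));
        SpanQuery query = new SpanOrQuery("body", Arrays.<SpanQuery>asList(
                new SpanTermQuery("body", "solr"),
                new SpanRegexQuery("body", "luc.*")));
        return collectTerms(query, null, index);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [body:lucene, body:lucid, body:solr]
    }
}
```

In this sketch, calling extractTerms() directly on the regex query still throws, which mirrors the reported failure. The composite reports that it must be rewritten because one of its clauses does, so the regex clause is expanded into term queries before any terms are collected.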

