lucene-dev mailing list archives

From "Luca Cavanna (JIRA)" <>
Subject [jira] [Commented] (LUCENE-5718) More flexible compound queries (containing mtq) support in postings highlighter
Date Sun, 01 Jun 2014 14:43:01 GMT


Luca Cavanna commented on LUCENE-5718:

Also worth mentioning that my patch addresses only the compound queries use case. It leaves
the automaton-related work for the different multi-term queries as it is (in MultiTermHighlighting).

> More flexible compound queries (containing mtq) support in postings highlighter
> -------------------------------------------------------------------------------
>                 Key: LUCENE-5718
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>    Affects Versions: 4.8.1
>            Reporter: Luca Cavanna
>         Attachments: LUCENE-5718.patch
> The postings highlighter currently pulls the automata from multi-term queries and doesn't
> require calling rewrite to make highlighting work. To do so, it also needs to check
> whether the query is a compound one and, if so, extract its subqueries. This is currently
> done in the MultiTermHighlighting class and works well, but it has two potential problems:
> 1) not all possible compound queries are necessarily supported, as we need to go over
> each of them one by one (see LUCENE-5717), and this requires keeping the "switch" up to date
> whenever new queries get added to Lucene
> 2) it doesn't support custom compound queries, only the set of queries available out of the box
> I've been thinking about how this can be improved, and one of the ideas I came up with
> is to introduce a generic way to retrieve the subqueries from compound queries: for instance,
> a new abstract base class with a getLeaves or getSubQueries method that all the compound
> queries extend. This method would return a flat array of all the leaf queries
> that the compound query is made of.
> I'm not sure whether this would be needed in other places in Lucene, but it doesn't seem
> like a small change, and it would definitely affect (or benefit?) more than just the postings
> highlighter's support for multi-term queries.
> In particular, the second problem (custom queries) seems hard to solve without a way to
> expose this info directly from the query, unless we want to make the MultiTermHighlighting#extractAutomata
> method extensible in some way.
> I'd like to hear what people think and to work on this as soon as we have identified which
> direction we want to take.
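
To make the base-class idea from the description concrete, here is a minimal sketch of what it might look like. It is not taken from the attached patch and does not use any existing Lucene API: Query here is a stand-in for Lucene's own class, and CompoundQuery, PatternQuery, AnyOfQuery, and leafPatterns are hypothetical names chosen only for illustration. It assumes each compound query exposes its direct children through getSubQueries, and that getLeaves flattens nested compound queries recursively (returning a list rather than an array, for brevity).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LeafExtractionSketch {

    /** Hypothetical stand-in for Lucene's Query class. */
    static abstract class Query {
    }

    /** Hypothetical base class that every compound query would extend. */
    static abstract class CompoundQuery extends Query {

        /** Direct children only; each compound query implements this itself. */
        abstract List<Query> getSubQueries();

        /** Flattens nested compound queries into the list of leaf queries. */
        final List<Query> getLeaves() {
            List<Query> leaves = new ArrayList<>();
            collectLeaves(this, leaves);
            return leaves;
        }

        private static void collectLeaves(Query query, List<Query> out) {
            if (query instanceof CompoundQuery) {
                // Recurse into compound queries, including custom ones.
                for (Query sub : ((CompoundQuery) query).getSubQueries()) {
                    collectLeaves(sub, out);
                }
            } else {
                out.add(query);
            }
        }
    }

    /** Hypothetical leaf query standing in for a multi-term query. */
    static final class PatternQuery extends Query {
        final String pattern;

        PatternQuery(String pattern) {
            this.pattern = pattern;
        }
    }

    /** Hypothetical custom compound query, unknown to any central "switch". */
    static final class AnyOfQuery extends CompoundQuery {
        private final List<Query> clauses;

        AnyOfQuery(Query... clauses) {
            this.clauses = Arrays.asList(clauses);
        }

        @Override
        List<Query> getSubQueries() {
            return clauses;
        }
    }

    /** What a highlighter might do: visit the leaves without knowing the compound type. */
    static List<String> leafPatterns(CompoundQuery query) {
        List<String> patterns = new ArrayList<>();
        for (Query leaf : query.getLeaves()) {
            if (leaf instanceof PatternQuery) {
                patterns.add(((PatternQuery) leaf).pattern);
            }
        }
        return patterns;
    }

    static CompoundQuery sampleQuery() {
        return new AnyOfQuery(
                new PatternQuery("luc*"),
                new AnyOfQuery(new PatternQuery("high?ight"), new PatternQuery("post*")));
    }

    public static void main(String[] args) {
        System.out.println(leafPatterns(sampleQuery()));
    }
}
```

With something along these lines, the highlighter would only need to check for the base class and then inspect each leaf for an automaton, so a custom compound query would be covered without any change to MultiTermHighlighting.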

This message was sent by Atlassian JIRA