lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-6801) PhraseQuery incorrectly advertises it supports terms at the same position
Date Mon, 19 Oct 2015 13:49:05 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963325#comment-14963325 ]

Adrien Grand commented on LUCENE-6801:
--------------------------------------

Oh, I think I just understood the intention of these javadocs: imagine you have a token stream
that sometimes puts terms at the same position and you want to find all documents that have
both terms A and B at the same position. You can do that with PhraseQuery. On the other hand,
what MultiPhraseQuery allows you to do is search for documents that have either A or B
at a given position (relative to some other terms). So maybe the javadoc change should
be to say that PhraseQuery will try to match all terms at a given position, and that for the
synonym use-case MultiPhraseQuery should be used instead?
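
To make the distinction above concrete, here is a small sketch that does not use Lucene at all. It is a toy model, and everything in it (the class SamePositionDemo, the method matches, the position-to-terms map standing in for a document) is made up for illustration; it only mimics the "all terms at a position" versus "any term at a position" semantics described above, not how PhraseQuery or MultiPhraseQuery are actually implemented.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: a "document" is a map from position to the set of terms
// indexed at that position. None of these names come from Lucene.
public class SamePositionDemo {

    // Each slot holds the terms requested at one relative position of the phrase.
    // requireAll = true  -> every term of a slot must be present at that position
    //                       (the PhraseQuery-like reading discussed above).
    // requireAll = false -> at least one term of a slot must be present
    //                       (the MultiPhraseQuery-like reading, e.g. synonyms).
    static boolean matches(Map<Integer, Set<String>> doc,
                           List<Set<String>> slots,
                           boolean requireAll) {
        // Try every indexed position as a possible start of the phrase.
        for (int start : doc.keySet()) {
            boolean ok = true;
            for (int i = 0; i < slots.size() && ok; i++) {
                Set<String> present = doc.getOrDefault(start + i, Set.of());
                Set<String> wanted = slots.get(i);
                ok = requireAll
                        ? present.containsAll(wanted)
                        : wanted.stream().anyMatch(present::contains);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "fox" and "vixen" were both indexed at position 1.
        Map<Integer, Set<String>> stacked =
                Map.of(0, Set.of("quick"), 1, Set.of("fox", "vixen"));
        // Only "fox" was indexed at position 1.
        Map<Integer, Set<String>> single =
                Map.of(0, Set.of("quick"), 1, Set.of("fox"));

        // Phrase: "quick" followed by a position carrying "fox" and "vixen".
        List<Set<String>> slots = List.of(Set.of("quick"), Set.of("fox", "vixen"));

        System.out.println("all-terms, stacked doc: " + matches(stacked, slots, true));
        System.out.println("all-terms, single doc:  " + matches(single, slots, true));
        System.out.println("any-term,  stacked doc: " + matches(stacked, slots, false));
        System.out.println("any-term,  single doc:  " + matches(single, slots, false));
    }
}
```

Under this model, only the "all terms" reading rejects the document that has just one of the two terms at the shared position, which is why it is the wrong tool for the synonym use-case.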

> PhraseQuery incorrectly advertises it supports terms at the same position
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-6801
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6801
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>            Reporter: David Smiley
>            Priority: Minor
>         Attachments: LUCENE_6801.patch
>
>
> The following in PhraseQuery has been here since Sept 15th 2004 (by "goller"):
> {code:java}
>     /**
>      * Adds a term to the end of the query phrase.
>      * The relative position of the term within the phrase is specified explicitly.
>      * This allows e.g. phrases with more than one term at the same position
>      * or phrases with gaps (e.g. in connection with stopwords).
>      * 
>      */
>     public Builder add(Term term, int position) {
> {code}
> Of course this isn't true; it's why we have MultiPhraseQuery.  Yet we even allow you
> to have consecutive terms with the same positions.  We shouldn't allow that; we should throw
> an exception.  For my own sanity, I modified a simple MultiPhraseQuery test to use PhraseQuery
> instead and of course it didn't work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

