lucene-dev mailing list archives

From "Greg Pendlebury (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators
Date Thu, 01 May 2014 06:52:15 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986408#comment-13986408
] 

Greg Pendlebury commented on SOLR-2649:
---------------------------------------

I applied this patch to 4.7.2 yesterday and tried it out on our dev servers. At first I thought
it was pretty bad and failed completely... but then I had a good think, re-read everything
on this ticket and this[1] article, and realised my understanding of the problem was flawed.
With mm=100%, this patch on its own effectively converted all of the OR operators to AND
operators. Very confusing behaviour for our business area, but I realise now that it is correct.

Perhaps the confusion stems from the way the q.op and mm parameters interact. If the behaviour
were instead to separate them more clearly, we could change the config entirely. At the
moment our mm is 100% because we effectively want q.op=AND, but if q.op were instead applied
1) always, 2) first and 3) independently of mm (i.e. insert AND wherever an operator is missing),
we could set mm=1 and achieve what we want while respecting the OR operators provided by the
user.
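
To illustrate the "insert AND wherever an operator is missing" idea, here is a minimal sketch. It is not the patch and not Solr code: DefaultOperatorSketch and insertDefaultOperator are hypothetical names, and it assumes a query made only of whitespace-separated bare terms plus the AND and OR keywords (no NOT, +/-, parentheses, phrases or field names).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not how edismax parses queries.
class DefaultOperatorSketch {

    // Makes every implicit operator explicit by inserting defaultOp
    // between two adjacent terms that have no AND/OR between them.
    static String insertDefaultOperator(String query, String defaultOp) {
        List<String> out = new ArrayList<>();
        String prev = null;
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty query yields a single empty token
            }
            boolean operatorMissing =
                    prev != null && !isBinaryOp(prev) && !isBinaryOp(token);
            if (operatorMissing) {
                out.add(defaultOp);
            }
            out.add(token);
            prev = token;
        }
        return String.join(" ", out);
    }

    // Assumption: only the upper-case keywords AND and OR count as operators.
    static boolean isBinaryOp(String token) {
        return token.equals("AND") || token.equals("OR");
    }

    public static void main(String[] args) {
        // prints "stocks AND oil OR gold"
        System.out.println(insertDefaultOperator("stocks oil OR gold", "AND"));
    }
}
```

Under this reading, the OR typed by the user survives untouched and only the gap between "stocks" and "oil" is filled, so mm=1 would no longer need to stand in for q.op=AND.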

I've added this on top of the patch already here and deployed again to our dev servers using
'q.op=AND & mm=1' and now everything appears to function as it should. I'll upload the
patch in a minute, and it includes several unit tests with different mm and q.op values. From
my perspective I think the two parameters are interacting appropriately, but perhaps someone
with more convoluted mm settings could give it a try?

The change is simply in the constructor of the ExtendedSolrQueryParser class, where it was
hardcoded to force the default operator to OR (presumably so that mm would take care of things).
I've made it look at the parameter provided with the query (copied the code from the Simple
QParser and adjusted to fit).
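
The shape of that change can be sketched roughly as follows. This is a standalone illustration, not the code in the patch: the Map stands in for the request parameters, the Operator enum and resolveDefaultOperator are hypothetical names, and falling back to OR when q.op is absent is an assumption chosen to match the previously hardcoded behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; the real parser reads Solr request params.
class QOpParamSketch {

    enum Operator { AND, OR }

    // Picks the default operator from the "q.op" request parameter instead
    // of hardcoding OR. Assumption: anything other than "AND" means OR.
    static Operator resolveDefaultOperator(Map<String, String> params) {
        String qOp = params.get("q.op");
        return "AND".equalsIgnoreCase(qOp) ? Operator.AND : Operator.OR;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("q.op", "AND");
        params.put("mm", "1");
        // prints "AND"
        System.out.println(resolveDefaultOperator(params));
        // prints "OR" because q.op is not set
        System.out.println(resolveDefaultOperator(new HashMap<>()));
    }
}
```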

I have slightly tweaked the unit test from the first patch that was marked TODO. I think not
finding a result in that case is entirely appropriate if the user can now tweak q.op. Opinions
may vary of course.

[1] http://searchhub.org/2011/12/28/why-not-and-or-and-not/

> MM ignored in edismax queries with operators
> --------------------------------------------
>
>                 Key: SOLR-2649
>                 URL: https://issues.apache.org/jira/browse/SOLR-2649
>             Project: Solr
>          Issue Type: Bug
>          Components: query parsers
>            Reporter: Magnus Bergmark
>            Priority: Minor
>             Fix For: 4.9, 5.0
>
>         Attachments: SOLR-2649.diff, SOLR-2649.patch
>
>
> Hypothetical scenario:
>   1. User searches for "stocks oil gold" with MM set to "50%"
>   2. User adds "-stockings" to the query: "stocks oil gold -stockings"
>   3. User gets no hits since MM was ignored and all terms were AND-ed together
> The behavior seems to be intentional, although the reason why is never explained:
>   // For correct lucene queries, turn off mm processing if there
>   // were explicit operators (except for AND).
>   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; 
> (lines 232-234 taken from tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
> This makes edismax unsuitable as a replacement for dismax; mm is one of the primary features
> of dismax.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


