lucene-solr-user mailing list archives

From Jack Krupansky <jack.krupan...@gmail.com>
Subject Re: Invalid parsing with solr edismax operators
Date Wed, 04 Nov 2015 12:55:57 GMT
It is debatable whether this is a bug or just a poorly documented
interaction of q.op, mm, and nested queries (within parentheses).
Personally, I'd say it is a bug. Edismax only obeys q.op and mm at
the top level of the query; once you nest within parentheses, the default
operator reverts to Lucene's internal default of OR. Why the second query
is treated differently with regard to those parentheses is baffling; it is
some quirk of the query parser, which few people have a solid handle on. I
suspect that, because there are no additional query terms or operators
around those top-level parentheses, the query parser logic acts
as if the parentheses were not there.

You neglected to give us your qf parameter, but judging from the boosts in
the parsed query, I think it is: qf=Title^200.0 TotalField^0.1
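
To compare the cases side by side, something like the following sketch could build the request URLs. It only constructs URLs and does not contact Solr. The host, port, and core name are hypothetical placeholders, and defType=edismax and the qf value are assumptions inferred from the report, not settings the poster confirmed. The third query, with an explicit AND inside the parentheses, is included only as a variant that does not rely on the default operator; whether it gives the intended results on 4.8.1 is untested here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port, and core name are placeholders.
BASE_URL = "http://localhost:8983/solr/collection1/select"

# mm, q.op, df, fl, rows, debug, and wt come from the original report.
# defType and qf are assumptions inferred from the parsed query output.
COMMON_PARAMS = {
    "defType": "edismax",
    "qf": "Title^200.0 TotalField^0.1",
    "mm": "2",
    "q.op": "AND",
    "df": "TotalField",
    "fl": "Title",
    "rows": "10",
    "debug": "true",
    "wt": "json",
}

# The two reported queries, plus a variant with an explicit operator
# inside the parentheses.
QUERIES = [
    "+(public libraries)",
    "(public libraries)",
    "+(public AND libraries)",
]


def build_url(q):
    """Return a select URL for query string q with the shared parameters."""
    params = dict(COMMON_PARAMS, q=q)
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    for q in QUERIES:
        print(build_url(q))
```

Fetching each URL and comparing the "parsedquery" entries in the debug output should show whether the nested clause carries the ~2 minimum-match marker or falls back to plain OR.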

-- Jack Krupansky

On Wed, Nov 4, 2015 at 3:39 AM, Mahmoud Almokadem <prog.mahmoud@gmail.com>
wrote:

> Hello,
>
> I'm using Solr 4.8.1. With edismax as the parser, we get undesirable
> parsed queries and results. The following are two different cases with
> strange behavior. Searching with these parameters
>
>   "mm":"2",
>   "df":"TotalField",
>   "debug":"true",
>   "indent":"true",
>   "fl":"Title",
>   "start":"0",
>   "q.op":"AND",
>   "fq":"",
>   "rows":"10",
>   "wt":"json"
> and the query is
>
> "q":"+(public libraries)",
> retrieves 502 documents with this parsed query
>
> "rawquerystring":"+(public libraries)",
> "querystring":"+(public libraries)",
> "parsedquery":"(+(+(DisjunctionMaxQuery((Title:public^200.0 |
> TotalField:public^0.1)) DisjunctionMaxQuery((Title:libraries^200.0 |
> TotalField:libraries^0.1)))))/no_coord",
> "parsedquery_toString":"+(+((Title:public^200.0 | TotalField:public^0.1)
> (Title:libraries^200.0 | TotalField:libraries^0.1)))"
> and if the query is
>
> "q":" (public libraries) "
> then it retrieves 8 documents with this parsed query
>
> "rawquerystring":" (public libraries) ",
> "querystring":" (public libraries) ",
> "parsedquery":"(+((DisjunctionMaxQuery((Title:public^200.0 |
> TotalField:public^0.1)) DisjunctionMaxQuery((Title:libraries^200.0 |
> TotalField:libraries^0.1)))~2))/no_coord",
> "parsedquery_toString":"+(((Title:public^200.0 | TotalField:public^0.1)
> (Title:libraries^200.0 | TotalField:libraries^0.1))~2)"
> So adding "+" before the parentheses, to require all tokens,
> retrieves more results than omitting it.
>
> Is this a bug in this version, or is there something I am missing?
