lucene-solr-user mailing list archives

From Amar Raja <amar.r...@thecommercepartnership.com>
Subject Re: SynonymGraphFilterFactory with edismax
Date Thu, 02 Nov 2017 15:02:04 GMT
Thanks Steve,

We have a smoking gun! I am on 6.5.1; I have tested in 7.1 and don't see
the same issue there.

I can't upgrade just yet; however, I have found that setting mm=1 sorts this
out in my case, giving me the following:

(+(+DisjunctionMaxQuery((((web_name:metal (+web_name:rose
+web_name:gold))~1))~1.0)))/no_coord

I am still testing, however it looks positive so far.
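
To illustrate why the "~1" versus "~2" suffix matters, here is a rough toy
model in Python. It is not Solr code; it only assumes that the parsed query
has two optional clauses (web_name:metal and the required pair rose + gold)
and that the number after "~" is the minimum number of those clauses that
must match. The function name and the sample documents are made up for
illustration.

```python
def matches(doc_terms, min_should_match):
    """Toy model of the two optional clauses in the parsed query.

    doc_terms: set of (already stemmed) terms in the web_name field.
    min_should_match: the number shown after "~" in the parsed query.
    """
    clauses = [
        "metal" in doc_terms,           # web_name:metal
        {"rose", "gold"} <= doc_terms,  # (+web_name:rose +web_name:gold)
    ]
    # Count how many optional clauses matched and compare to the minimum.
    return sum(clauses) >= min_should_match


# Hypothetical documents, represented as sets of stemmed terms.
metallic_only = {"metal", "watch"}
rose_gold_only = {"rose", "gold", "ring"}
both = {"metal", "rose", "gold"}

for name, doc in [("metallic_only", metallic_only),
                  ("rose_gold_only", rose_gold_only),
                  ("both", both)]:
    print(name, "mm=2:", matches(doc, 2), "mm=1:", matches(doc, 1))
```

Under this model, a minimum of 2 only matches the document containing both
synonyms, which is the behaviour I was seeing, while a minimum of 1 matches
all three.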

One thing I still noticed is that single-word synonyms are output as
"Synonym(...)" within the debug query, but multi-word synonyms are not,
even in 7.1.0.

Is this an issue? To be honest, I am not sure what it means; it is just
something I noticed as a difference in the parsed query.
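
For anyone wanting to compare the parsed query themselves, here is a small
Python sketch that builds the kind of request URL I have been using. The
host, port and collection name ("products") are hypothetical, and qf=web_name
is an assumption based on the field shown in the parsed query; defType, q.op,
mm and debugQuery are standard Solr request parameters.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host and collection to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/products/select"


def build_debug_url(user_query, mm="1"):
    """Build an edismax request URL that also returns the parsed query."""
    params = {
        "defType": "edismax",
        "q": user_query,
        "qf": "web_name",      # assumed query field
        "q.op": "AND",
        "mm": mm,
        "debugQuery": "true",  # includes the parsed query in the response
        "rows": "0",           # only the debug section is of interest here
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_debug_url("metallic")
print(url)
```

Requesting that URL for "kids" and for "metallic" and comparing the parsed
query in the debug section of each response shows the difference described
above.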

Thanks again for your help, I thought I was going mad for a while.



On 2 November 2017 at 14:38, Steve Rowe <sarowe@gmail.com> wrote:

> Hi Amar,
>
> What version of Solr are you using?  This looks like a bug that was fixed
> in Solr 6.6.1: <https://issues.apache.org/jira/browse/LUCENE-7878>.
>
> --
> Steve
> www.lucidworks.com
>
> > On Nov 2, 2017, at 8:31 AM, Amar Raja <amar.raja@thecommercepartnership.com> wrote:
> >
> > Hello,
> >
> > I have the following field definition:
> >
> > <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
> >  <analyzer type="query">
> >    <tokenizer class="solr.StandardTokenizerFactory"/>
> >    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
> >            ignoreCase="true" expand="true"/>
> >    <filter class="solr.StopFilterFactory" ignoreCase="true"
> >            words="lang/stopwords_en.txt"/>
> >    <filter class="solr.LowerCaseFilterFactory"/>
> >    <filter class="solr.EnglishPossessiveFilterFactory"/>
> >    <filter class="solr.KeywordMarkerFilterFactory"
> >            protected="protwords.txt"/>
> >    <filter class="solr.PorterStemFilterFactory"/>
> >  </analyzer>
> > </fieldType>
> >
> > And the following two synonym definitions:
> >
> > kids => boys,girls
> > metallic => rose gold,metallic
> >
> > The intent being a user searching for "kids" should get girls or boys
> > results, but searching for "boys" will not bring back girls results.
> > Similarly searching for "metallic" should bring back results for either
> > "metallic" or "rose gold", but the search for "rose gold" should not
> > bring back "metallic".
> >
> > Another property I have set is q.op=AND, i.e. "boys tops" should return
> > only documents where both terms exist.
> >
> > The first synonym works well, producing the following dismax query:
> >
> > (+(+DisjunctionMaxQuery((Synonym(web_name:boi
> > web_name:girl))~1.0)))/no_coord
> >
> > However, for the second I get this:
> >
> > (+(+DisjunctionMaxQuery(((((+web_name:rose +web_name:gold)
> > web_name:metal)~2))~1.0)))/no_coord
> >
> > Whenever any of the synonyms on the RHS has multiple terms, it seems to
> > want to match both synonyms, so in this case only documents with both
> > "metallic" and "rose gold" will match.
> >
> > Any ideas where I am going wrong?
>
>
