lucene-solr-user mailing list archives

From Doug Turnbull <dturnb...@opensourceconnections.com>
Subject Re: Possible to disable SynonymQuery and get legacy behavior?
Date Tue, 21 Nov 2017 21:57:49 GMT
I have submitted a patch that makes the query generated for overlapping
query terms somewhat configurable (with SynonymQuery remaining the default),
based on practices I've seen in the field. I'd love to hear feedback:

https://issues.apache.org/jira/browse/SOLR-11662

On Tue, Nov 21, 2017 at 12:37 PM Doug Turnbull <
dturnbull@opensourceconnections.com> wrote:

> We help clients that perform semantic expansion to hypernyms at index
> time. For example, they will have a synonyms file containing rules such
> as the following:
>
> wing_tips => wing_tips, dress_shoes, shoes
> dress_shoes => dress_shoes, shoes
> oxfords => oxfords, dress_shoes, shoes
>
> Then at query time, we rely on the differing IDF of these terms in the
> same position to bring up matches on the rare, specific terms first,
> followed by increasingly semantically broad matches. For example,
> previously a search for wing_tips would get turned into
> "wing_tips OR dress_shoes OR shoes". The very common term shoes would get
> scored lowest; the very specific term wing_tips would get scored very
> highly.
>
> (I have a blog post about this, which uses Elasticsearch:
> http://opensourceconnections.com/blog/2016/12/23/elasticsearch-synonyms-patterns-taxonomies/
> )
>
> As our clients upgrade to Solr 6 and above, we're noticing our technique
> no longer works due to SynonymQuery, which blends the doc freq of
> synonyms at query time. SynonymQuery seems to be the right
> direction for most people :) Still, I would like to figure out whether
> there's a setting anywhere to return to the legacy behavior (a boolean
> query of term queries) so I don't have to go back to the drawing board for
> clients that rely on this technique.
>
> I've been going through QueryBuilder and I don't see where we could go
> back to the legacy behavior. The use of SynonymQuery seems to be based on
> position overlap.
>
> Thanks!
> -Doug
>
>
>
> --
> Consultant, OpenSource Connections. Contact info at
> http://o19s.com/about-us/doug-turnbull/; Free/Busy (
> http://bit.ly/dougs_cal)
>
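
For anyone following along, here is a rough sketch of the scoring difference
described in the quoted message. It is plain Python, not Lucene code. The
document counts are made up, the IDF formula is the BM25-style one, and the
"blended" case assumes the synonyms share a single document frequency equal
to the largest one among them. Term frequency and length normalization are
ignored, so treat it as an illustration of the idea only.

```python
import math

# Hypothetical corpus statistics, invented for illustration only.
NUM_DOCS = 10_000
DOC_FREQ = {"wing_tips": 20, "dress_shoes": 400, "shoes": 5000}


def bm25_idf(doc_freq, num_docs):
    """BM25-style inverse document frequency: rarer terms get larger values."""
    return math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def legacy_term_weights(terms):
    """Boolean query of term queries: each term keeps its own IDF."""
    return {t: bm25_idf(DOC_FREQ[t], NUM_DOCS) for t in terms}


def blended_term_weights(terms):
    """Synonym-style blending (assumption): all terms share one doc freq,
    taken here as the maximum doc freq among the synonyms."""
    shared_doc_freq = max(DOC_FREQ[t] for t in terms)
    shared_idf = bm25_idf(shared_doc_freq, NUM_DOCS)
    return {t: shared_idf for t in terms}


terms = ["wing_tips", "dress_shoes", "shoes"]
legacy = legacy_term_weights(terms)
blended = blended_term_weights(terms)

for t in terms:
    print(f"{t:12s} legacy={legacy[t]:.3f} blended={blended[t]:.3f}")
```

With separate term queries the rare term carries the largest weight, which
is what the taxonomy technique depends on. Once the document frequencies
are blended, every synonym carries the same weight and that signal is gone.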
-- 
Consultant, OpenSource Connections. Contact info at
http://o19s.com/about-us/doug-turnbull/; Free/Busy (http://bit.ly/dougs_cal)
