lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] [Commented] (LUCENE-6664) Replace SynonymFilter with SynonymGraphFilter
Date Sun, 02 Aug 2015 19:56:04 GMT


Robert Muir commented on LUCENE-6664:

OK, if you say it's a generalization, then I am OK with it. But are you saying the current code in the
query parsers won't do the wrong thing? I know it's a tough one, since it is already somewhat wrong today.
I'm just asking because we don't want to make it worse or more confusing.

> Replace SynonymFilter with SynonymGraphFilter
> ---------------------------------------------
>                 Key: LUCENE-6664
>                 URL:
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 5.3, Trunk
>         Attachments: LUCENE-6664.patch, LUCENE-6664.patch, LUCENE-6664.patch, LUCENE-6664.patch, usa.png, usa_flat.png
> Spinoff from LUCENE-6582.
> I created a new SynonymGraphFilter (to replace the current buggy
> SynonymFilter), that produces correct graphs (does no "graph
> flattening" itself).  I think this makes it simpler.
> This means you must add the FlattenGraphFilter yourself, if you are
> applying synonyms during indexing.
> Index-time synonym expansion is a necessarily "lossy" graph transformation
> when multi-token (input or output) synonyms are applied, because the
> index does not store {{posLength}}, so there will always be phrase
> queries that should match but do not, and then phrase queries that
> should not match but do.
> goes into detail about this.
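
To illustrate the lossiness described in the quoted paragraph above, here is a minimal sketch in plain Java that does not use any Lucene API. It assumes a toy positional index (term to set of positions, with no posLength stored) and a hypothetical synonym rule that stacks "usa" on "united states of america"; the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy positional index: term -> positions. As in the real index, no posLength is stored.
class FlattenedIndexDemo {

    // Index "united states of america is big" and stack the (hypothetical) synonym
    // "usa" on position 0, the way a flattened index-time expansion would.
    static Map<String, Set<Integer>> buildIndex() {
        Map<String, Set<Integer>> index = new HashMap<>();
        String[] tokens = {"united", "states", "of", "america", "is", "big"};
        for (int pos = 0; pos < tokens.length; pos++) {
            index.computeIfAbsent(tokens[pos], k -> new TreeSet<>()).add(pos);
        }
        // "usa" really spans four positions, but only its start position is recorded.
        index.computeIfAbsent("usa", k -> new TreeSet<>()).add(0);
        return index;
    }

    // Exact phrase match: term i of the phrase must occur at position start + i.
    static boolean phraseMatches(Map<String, Set<Integer>> index, List<String> phrase) {
        Set<Integer> starts = index.getOrDefault(phrase.get(0), Collections.emptySet());
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < phrase.size() && ok; i++) {
                Set<Integer> positions = index.getOrDefault(phrase.get(i), Collections.emptySet());
                ok = positions.contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> index = buildIndex();
        // Should match but does not: "is" sits at position 4, not 1.
        System.out.println(phraseMatches(index, Arrays.asList("usa", "is", "big")));           // false
        // Should not match but does: "usa" looks like a one-position token.
        System.out.println(phraseMatches(index, Arrays.asList("usa", "states", "of", "america"))); // true
    }
}
```

Both wrong answers come from the same cause: once the length of the "usa" token is dropped, the index cannot tell that "usa" ends where "america" ends.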
> However, with this new SynonymGraphFilter, if instead you do synonym
> expansion at query time (and don't do the flattening), and you use
> TermAutomatonQuery (future: somehow integrated into a query parser),
> or maybe just "enumerate all paths and make union of PhraseQuery", you
> should get 100% correct matches (not sure about "proper" scoring
> though...).
> This new synonym filter still cannot consume an arbitrary graph.
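
The "enumerate all paths" idea from the quoted description can be sketched as follows. This is a plain-Java illustration, not code from the patch: it assumes the query-time token graph is acyclic and is given as a list of arcs between position nodes, where (to - from) plays the role of posLength. The `Arc` class, the `enumeratePaths` helper, and the "usa" synonym are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TokenGraphPaths {

    // One arc of the token graph: a term leading from one position node to another.
    static final class Arc {
        final String term;
        final int from;
        final int to;

        Arc(String term, int from, int to) {
            this.term = term;
            this.from = from;
            this.to = to;
        }
    }

    // Returns every term sequence that leads from node `start` to node `end`.
    static List<List<String>> enumeratePaths(List<Arc> arcs, int start, int end) {
        List<List<String>> out = new ArrayList<>();
        walk(arcs, start, end, new ArrayList<>(), out);
        return out;
    }

    // Depth-first walk; terminates because every arc is assumed to move forward (to > from).
    private static void walk(List<Arc> arcs, int node, int end,
                             List<String> prefix, List<List<String>> out) {
        if (node == end) {
            out.add(new ArrayList<>(prefix));
            return;
        }
        for (Arc arc : arcs) {
            if (arc.from == node) {
                prefix.add(arc.term);
                walk(arcs, arc.to, end, prefix, out);
                prefix.remove(prefix.size() - 1); // backtrack
            }
        }
    }

    public static void main(String[] args) {
        // Unflattened graph for the query "usa is big" with "usa" expanded.
        List<Arc> arcs = Arrays.asList(
            new Arc("usa", 0, 4),
            new Arc("united", 0, 1),
            new Arc("states", 1, 2),
            new Arc("of", 2, 3),
            new Arc("america", 3, 4),
            new Arc("is", 4, 5),
            new Arc("big", 5, 6));
        // Prints [[usa, is, big], [united, states, of, america, is, big]]
        System.out.println(enumeratePaths(arcs, 0, 6));
    }
}
```

Each returned path would then become one phrase query, and the union of those phrase queries is what gets run against an index that was built without synonym expansion. The number of paths can grow quickly when several multi-token synonyms overlap, which may be one reason the description also mentions TermAutomatonQuery.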
