lucene-java-user mailing list archives

From: Daniel Bigham
Subject: Query Expansion for Synonyms
Date: Thu, 28 Apr 2016 15:26:15 GMT

I'm investigating various ways of supporting synonyms in Lucene.

One such approach that looks potentially interesting is to do a kind of 
"query expansion".

For example, if the user searches for "us 1888", one might expand the 
query as follows:

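     // Imports as of Lucene 6.x:
     // import org.apache.lucene.index.Term;
     // import org.apache.lucene.search.spans.*;
     // slop = 0 and inOrder = true are illustrative choices
     // (terms must be adjacent and in the given order).
     SpanNearQuery query =
     new SpanNearQuery(
         new SpanQuery[] {
             new SpanOrQuery(
                 new SpanTermQuery(new Term("Plaintext", "us")),
                 new SpanNearQuery(
                     new SpanQuery[] {
                         new SpanTermQuery(new Term("Plaintext", "united")),
                         new SpanTermQuery(new Term("Plaintext", "states"))
                     }, 0, true)),
             new SpanTermQuery(new Term("Plaintext", "1888"))
         }, 0, true);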
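
The hand-built query above could in principle be generated from a synonym table. The following is only a rough sketch of that expansion step, not a Lucene API: to stay runnable without Lucene on the classpath it produces a string describing the query tree, where `term`, `or`, and `near` stand in for SpanTermQuery, SpanOrQuery, and SpanNearQuery. The class name `QueryExpansionSketch`, the `expand` and `phrase` helpers, and the whitespace tokenization are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class QueryExpansionSketch {

    // Renders one alternative: a single word becomes a term clause,
    // several words become a nested "near" clause over their terms.
    static String phrase(String field, String text) {
        String[] words = text.trim().split("\\s+");
        if (words.length == 1) {
            return "term(" + field + ":" + words[0] + ")";
        }
        List<String> parts = new ArrayList<>();
        for (String word : words) {
            parts.add("term(" + field + ":" + word + ")");
        }
        return "near(" + String.join(", ", parts) + ")";
    }

    // Expands each whitespace-separated token of the user query into an
    // "or" over the token itself and its synonyms (hypothetical table),
    // then wraps all per-token clauses in one outer "near" clause.
    static String expand(String field, String userQuery,
                         Map<String, List<String>> synonyms) {
        List<String> clauses = new ArrayList<>();
        for (String token : userQuery.trim().split("\\s+")) {
            List<String> alternatives = new ArrayList<>();
            alternatives.add(phrase(field, token));
            for (String synonym : synonyms.getOrDefault(token, List.of())) {
                alternatives.add(phrase(field, synonym));
            }
            clauses.add(alternatives.size() == 1
                    ? alternatives.get(0)
                    : "or(" + String.join(", ", alternatives) + ")");
        }
        return clauses.size() == 1
                ? clauses.get(0)
                : "near(" + String.join(", ", clauses) + ")";
    }

    public static void main(String[] args) {
        // Assumed synonym table with a single multi-word entry.
        Map<String, List<String>> synonyms =
                Map.of("us", List.of("united states"));
        System.out.println(expand("Plaintext", "us 1888", synonyms));
    }
}
```

For "us 1888" this yields the same shape as the query above: an outer near clause whose first element is an or over "us" and the nested near clause for "united states". Each added synonym contributes another branch to the or clause for its token.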

A couple of questions:

- Is this approach in use within the community?
- Are there "gotchas" with this approach that make it undesirable?

I've done a few quick query-performance tests on a test index and 
found that a query can indeed take 10x longer if enough synonyms are 
used. But if the baseline search time is around 1 ms, then 10 ms is 
still plenty fast. (That said, my test was on a 70 MB index, so my 
10 ms might turn into something nasty with a 7 GB index.)
