lucene-solr-user mailing list archives

From Walter Lewis
Subject Fuzzy searching, tildes and solr
Date Tue, 23 Jan 2007 23:19:50 GMT
This is quite possibly a Lucene question rather than a solr one, so my 
apologies if you think it's out of scope.

Underlying the solr search are some very useful Lucene constructs.

One of the most powerful, imho, is the tilde-plus-number combination for a 
"fuzzy" search.
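
As a rough illustration of what the number after the tilde means, here is a 
minimal Python sketch. It assumes the number is a minimum similarity, with 
similarity computed as 1 - editDistance / min(term lengths) and a strict 
"greater than" comparison. That formula is an assumption about how Lucene's 
FuzzyQuery scores candidate terms, not something checked against this data 
set. The function names and the sample terms "southerland" and "jones" are 
made up for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def similarity(query_term: str, indexed_term: str) -> float:
    """ASSUMED formula: 1 - distance / length of the shorter term."""
    shorter = min(len(query_term), len(indexed_term))
    if shorter == 0:
        return 0.0
    return 1.0 - edit_distance(query_term, indexed_term) / shorter


def matches(query_term: str, indexed_term: str, min_similarity: float) -> bool:
    """ASSUMED rule: a term matches when its similarity exceeds the threshold."""
    return similarity(query_term, indexed_term) > min_similarity


if __name__ == "__main__":
    # Illustrative term pairs only; they are not taken from the data set.
    for query, indexed in [("sutherland", "southerland"), ("james", "jones")]:
        print(query, indexed,
              round(similarity(query, indexed), 2),
              matches(query, indexed, 0.75))
```

Under this assumed formula a short term tolerates fewer edits than a long one 
at the same threshold: at 0.75, a five-letter term such as "james" can differ 
from an equally long candidate by one edit, a ten-letter term such as 
"sutherland" by two.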

In one of my data sets
    q=Sutherland returns 41 results
    q=Sutherland~0.75 returns 275
    q=Sutherland~0.70 returns 484
etc., all of which fits a pattern. Add a first name and
    q=(James Sutherland) returns 13
    q=(James~0.75 Sutherland~0.75) returns 1
    q=(James~0.70 Sutherland~0.70) returns 97
Qualify only one term and there is a consistent pattern.  But qualifying 
both terms routinely yields a smaller number than a plain string match.
    q=(James~0.75 AND Sutherland~0.75) returns the same single record 
(the schema has the default operator set to AND)
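
For anyone who wants to replay these queries, the following Python sketch 
builds the request URLs. The base URL is a hypothetical local instance and 
the helper name is made up for the example; rows=0 is added on the assumption 
that only the hit counts matter, not the documents.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the select URL of your own instance.
BASE_URL = "http://localhost:8983/solr/select"

# The queries exactly as quoted in this message.
QUERIES = [
    "Sutherland",
    "Sutherland~0.75",
    "Sutherland~0.70",
    "(James Sutherland)",
    "(James~0.75 Sutherland~0.75)",
    "(James~0.70 Sutherland~0.70)",
    "(James~0.75 AND Sutherland~0.75)",
]


def build_query_url(query: str, base_url: str = BASE_URL) -> str:
    """Return a select URL for one query, URL-encoding spaces and parentheses."""
    return base_url + "?" + urlencode({"q": query, "rows": 0})


if __name__ == "__main__":
    for q in QUERIES:
        print(build_query_url(q))
```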

Why would the ~0.75 *narrow* rather than broaden a search? Is there some 
pattern in the solr syntax I'm overlooking?


