lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: How to properly use Levenstein distance with ~ in Java
Date Tue, 21 Oct 2014 14:40:28 GMT
When used on bare terms, ~ is indeed "fuzzy matching" rather than
proximity; it's an overloaded operator in that sense.

If I had to guess, I'd guess that your analysis chain for the field is
doing "interesting" things for "taveranx" and the resulting token is
far enough "away" (in the Levenshtein sense) that it's not found.
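To make "far enough away" concrete, here is a minimal sketch of the classic
Levenshtein distance in plain Java. It is not Lucene's implementation (Lucene
uses automata and, by default, also counts a transposition as a single edit);
the class name EditDistance and the candidate tokens "tavern" and "tav" are
hypothetical stand-ins for whatever your analysis chain actually emits.

```java
final class EditDistance {
    // Classic Levenshtein distance: minimum number of single-character
    // insertions, deletions, and substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // delete all i characters of a's prefix
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String queryTerm = "taveranx";
        // Hypothetical indexed tokens; check the analysis page for the real one.
        for (String indexed : new String[] {"taveranx", "tavern", "tav"}) {
            System.out.println(queryTerm + " vs " + indexed + ": "
                    + levenshtein(queryTerm, indexed));
        }
    }
}
```

If the token in the index turns out to be more than two edits from the query
term, a fuzzy query will not reach it, since Lucene 4.x and later support at
most two edits.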

The admin/analysis page is very much your friend here; it'll show you
what the term taveranx becomes in your index.

You might try varying the "closeness" of the term by adding
taveranx~0.2 (or whatever) to your query to see if it's eventually
found.
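For what it's worth, a small sketch of building the two kinds of ~ query
strings from Java, so you can step through the closeness values in a loop. The
class and method names are made up for illustration, and the 0-2 range is an
assumption that you are on Lucene/Solr 4.x or later, where as far as I know
the number after a bare term is read as a maximum edit count and fractional
values like 0.2 are the older similarity syntax. It does no escaping of
special query characters.

```java
final class TildeQueries {
    // Bare term followed by ~N: fuzzy matching with at most N edits.
    // Assumes Lucene 4.x+ semantics, where only 0, 1, or 2 edits are supported.
    static String fuzzy(String term, int maxEdits) {
        if (maxEdits < 0 || maxEdits > 2) {
            throw new IllegalArgumentException("maxEdits must be 0, 1, or 2");
        }
        return term + "~" + maxEdits;
    }

    // Quoted phrase followed by ~N: proximity matching with slop N.
    static String proximity(String phrase, int slop) {
        if (slop < 0) {
            throw new IllegalArgumentException("slop must be non-negative");
        }
        return "\"" + phrase + "\"~" + slop;
    }

    public static void main(String[] args) {
        // Widen the fuzzy match step by step and see when the term is found.
        for (int edits = 0; edits <= 2; edits++) {
            System.out.println(fuzzy("taveranx", edits));
        }
        // Same operator, different meaning once the input is a quoted phrase.
        System.out.println(proximity("foo bar", 4));
    }
}
```

Each resulting string can be passed as the q parameter of whatever client you
are already using.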

And as a test see if specifying fuzzy operations works on other terms,
in which case my hypothesis will get a little support....

Best,
Erick



On Tue, Oct 21, 2014 at 1:07 AM, Ramzi Alqrainy
<ramzi.alqrainy@gmail.com> wrote:
> Because ~ is proximity matching. Lucene supports finding words that are
> within a specific distance of each other.
> For example, to search for "foo" and "bar" within 4 words of each other:
>
> "foo bar"~4
>
> Note that for proximity searches, exact matches are proximity zero, and word
> transpositions (bar foo) are proximity 1.
> A query such as "foo bar"~10000000 is an interesting alternative to foo AND
> bar.
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/How-to-properly-use-Levenstein-distance-with-in-Java-tp4164793p4165079.html
> Sent from the Solr - User mailing list archive at Nabble.com.
