lucene-java-user mailing list archives

From Furkan KAMACI <furkankam...@gmail.com>
Subject Re: What is edit distance 2 mean for fuzzy queries?
Date Mon, 05 Aug 2019 00:40:53 GMT
Hi Baris,

Terms of length 1 or 2 will sometimes not match because of how the scaled
distance between two terms is computed. For a term to match, the edit
distance between the two terms must be less than the length of the shorter
term (either the input term or the candidate term). For example, a
FuzzyQuery on term "abcd" with maxEdits=2 will not match an indexed term
"ab", and a FuzzyQuery on term "a" with maxEdits=2 will not match an
indexed term "abc".
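
To make the rule above concrete, here is a minimal sketch in plain Java with
no Lucene dependency. It is not Lucene's implementation: the class and method
names (FuzzyRuleSketch, editDistance, passesRule) are made up for
illustration, and it uses plain Levenshtein distance, whereas FuzzyQuery by
default also counts a transposition as a single edit. It only restates the
"edit distance must be less than the shorter term's length" condition.

```java
public class FuzzyRuleSketch {

    // Plain Levenshtein distance (insert, delete, substitute) using two rows.
    // Assumption: transpositions are NOT treated as a single edit here.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical helper: the rule as worded in this message, i.e. the
    // distance must be within maxEdits AND strictly less than the length
    // of the shorter of the two terms.
    static boolean passesRule(String queryTerm, String indexedTerm, int maxEdits) {
        int distance = editDistance(queryTerm, indexedTerm);
        int shorter = Math.min(queryTerm.length(), indexedTerm.length());
        return distance <= maxEdits && distance < shorter;
    }

    public static void main(String[] args) {
        System.out.println(passesRule("abcd", "ab", 2));  // false: distance 2, shorter length 2
        System.out.println(passesRule("a", "abc", 2));    // false: distance 2, shorter length 1
        System.out.println(passesRule("ridg", "ridge", 2)); // true: distance 1, shorter length 4
        System.out.println(editDistance("rid", "ridge")); // 2
    }
}
```

The two "false" cases are the examples from the paragraph above; the last
line only confirms that "rid" and "ridge" are two edits apart.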

You can check this in the FuzzyQuery documentation:
https://lucene.apache.org/core/8_2_0/core/org/apache/lucene/search/FuzzyQuery.html

Kind Regards,
Furkan KAMACI

On Fri, Jun 28, 2019 at 10:10 PM <baris.kazar@oracle.com> wrote:

> I think I have an answer for this one:
>
> For words of length 5 or shorter, only edit distance 1 works, but for
> longer words
>
> I see that edit distance 2 works OK.
>
> Best regards
>
>
> On 6/28/19 2:25 PM, baris.kazar@oracle.com wrote:
> > Hi,
> >
> > search Query: [+streetDFLT:ridg~2, +cityDFLT:"nashua",
> > +regionDFLT:"new-hampshire", +countryDFLT:"united" +countryDFLT:"states"]
> >
> > Name: Ridge Rd
> > Score: 35.297863
> > ID: 10242301
> > Country Code: US
> > Coordinates: 42.70569, -71.49599
> > Search Key: street="RIDGE" city="NASHUA" municipality="HILLSBOROUGH"
> > region="NEW HAMPSHIRE" country="UNITED STATES"
> >
> > With this query I can find RIDGE, but
> >
> > with the search string RID I can't find RIDGE at all:
> >
> > search Query: [+streetDFLT:rid~2, +cityDFLT:"nashua",
> > +regionDFLT:"new-hampshire", +countryDFLT:"united" +countryDFLT:"states"]
> >
> > Name: NASHUA
> > Score: 28.291311
> > ID: 21014865
> > Country Code: US
> > Coordinates: 42.75873, -71.46438
> > Search Key: city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> > HAMPSHIRE" country="UNITED STATES"
> >
> > Name: NASHUA
> > Score: 28.291311
> > ID: 21014865
> > Country Code: US
> > Coordinates: 42.75873, -71.46438
> > Search Key: city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> > HAMPSHIRE" country="UNITED STATES"
> >
> > Name: NASHUA
> > Score: 28.291311
> > ID: 21014865
> > Country Code: US
> > Coordinates: 42.75873, -71.46438
> > Search Key: city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> > HAMPSHIRE" country="UNITED STATES"
> >
> > Name: NASHUA
> > Score: 28.291311
> > ID: 21014865
> > Country Code: US
> > Coordinates: 42.75873, -71.46438
> > Search Key: city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> > HAMPSHIRE" country="UNITED STATES"
> >
> > Name: Pennichuck St
> > Score: 28.291311
> > ID: 8022314
> > Country Code: US
> > Coordinates: 42.79266, -71.46672
> > Search Key: street="PENNICHUCK" city="NASHUA"
> > municipality="HILLSBOROUGH" region="NEW HAMPSHIRE" country="UNITED
> > STATES"
> >
> > Name: Hartford Ln
> > Score: 28.291311
> > ID: 9817672
> > Country Code: US
> > Coordinates: 42.78252, -71.49689
> > Search Key: street="HARTFORD" city="NASHUA"
> > municipality="HILLSBOROUGH" region="NEW HAMPSHIRE" country="UNITED
> > STATES"
> >
> > Name: Marblehead Dr
> > Score: 28.291311
> > ID: 12762505
> > Country Code: US
> > Coordinates: 42.79743, -71.50919
> > Search Key: street="MARBLEHEAD" city="NASHUA"
> > municipality="HILLSBOROUGH" region="NEW HAMPSHIRE" country="UNITED
> > STATES"
> >
> > RID is an edit distance of 2 away from RIDGE, right? Should I enable
> > something during indexing for fuzzy queries?
> > Best regards
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
