lucene-solr-user mailing list archives

From కామేశ్వర రావు భైరవభట్ల <kamesh...@gmail.com>
Subject Re: Search for misspelled words in corpus
Date Mon, 10 Jun 2013 05:03:11 GMT
Hi Upayavira,

The word I am searching for is "fight". Terms like "figth" and "figh" are
misspellings of "fight", so I would like to find them. "sight" is
obviously not a misspelling of "fight". Even if it were a typo, I don't
really want to match "sight" with "fight".
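
To make the distinction concrete, here is a minimal Python sketch of one
possible rule. It assumes, following the guess in the quoted reply below,
that a misspelling keeps the first letter of the correct word and differs
by at most one edit, where swapping two adjacent letters counts as a single
edit. The function names and the default thresholds are hypothetical and
are not part of Lucene or Solr.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Counts insertions, deletions, substitutions, and swaps of two
    adjacent letters, each as one edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i letters of a
    for j in range(cols):
        d[0][j] = j  # insert all j letters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ht" <-> "th".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def is_plausible_variant(query: str, candidate: str,
                         max_edits: int = 1, prefix_len: int = 1) -> bool:
    """True if candidate is the query itself or a likely misspelling of it.

    Assumed rule (hypothetical): the first prefix_len letters must match
    exactly and the edit distance must not exceed max_edits.
    """
    if query[:prefix_len] != candidate[:prefix_len]:
        return False
    return osa_distance(query, candidate) <= max_edits


if __name__ == "__main__":
    for word in ["fight", "figth", "figh", "feight", "sight", "light"]:
        print(word, is_plausible_variant("fight", word))
```

Under these assumptions "figth", "figh" and "feight" are accepted, while
"sight" and "light" are rejected because their first letter differs. The
rule is only a heuristic: a typo in the first letter would be missed, and a
real word such as "fit" that happens to be close enough would need stricter
thresholds or a dictionary check.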

regards,
Kamesh

On Sun, Jun 9, 2013 at 10:49 PM, Upayavira <uv@odoko.co.uk> wrote:

> You haven't stated why figh is correct and sight isn't. Is it because
> the first letter is different?
>
> Upayavira
>
> On Wed, Jun 5, 2013, at 02:10 PM, కామేశ్వర రావు భైరవభట్ల wrote:
> > Hi,
> >
> > I have a problem where the text corpus we need to search contains many
> > misspelled words. The same word may be misspelled in several different
> > ways, and the corpus also has documents with the correct spelling.
> > However, the search term we give in the query will always be spelled
> > correctly. When we search on a term, we would like to get all the
> > documents that contain either the correct or a misspelled form of the
> > search term.
> > We tried fuzzy search, but it doesn't work as we expected. It returns
> > any close match, not specifically misspelled words. For example, if I'm
> > searching for a word like "fight", I would like to get back the
> > documents that have words like "figth" and "feight", not documents with
> > words like "sight" and "light".
> > Is there any suggested approach for doing this?
> >
> > regards,
> > Kamesh
>
