lucene-solr-user mailing list archives

From కామేశ్వర రావు భైరవభట్ల <>
Subject Search for misspelled words in corpus
Date Wed, 05 Jun 2013 13:10:25 GMT

Our text corpus, on which we need to run searches, contains many misspelled
words. The same word may be misspelled in several different ways, and some
documents also contain the correct spelling. However, the search term given
in the query is always spelled correctly. When we search on a term, we would
like to get all the documents that contain either the correct or a misspelled
form of the search term.
We tried fuzzy search, but it does not behave as we expected. It returns any
close match, not specifically misspelled words. For example, if I search for
a word like "fight", I would like to get back documents that contain words
like "figth" and "feight", but not documents with words like "sight" and
"light".
Is there any suggested approach for doing this?
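
To make the problem concrete, here is a minimal Python sketch (plain Python,
not Solr code) of why an edit-distance match cannot tell the two groups
apart. The function names, the max_edits value of 2, and the one-character
prefix rule are assumptions made for illustration only; they are not taken
from our setup or from the Solr API.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def matches_with_prefix(query: str, candidate: str,
                        max_edits: int = 2, prefix_len: int = 1) -> bool:
    """Hypothetical filter: the candidate must share the first
    prefix_len characters with the query AND be within max_edits."""
    same_prefix = candidate[:prefix_len] == query[:prefix_len]
    return same_prefix and levenshtein(query, candidate) <= max_edits


words = ["figth", "feight", "sight", "light"]

# Plain edit distance: the unwanted words are at least as close as the typos.
distances = {w: levenshtein("fight", w) for w in words}

# With the assumed prefix rule, only the two misspellings remain.
kept = [w for w in words if matches_with_prefix("fight", w)]
```

In this sketch "sight" and "light" are one edit away from "fight", while
"figth" is two edits away, so a distance threshold alone always admits the
unwanted words. The prefix rule is only a rough heuristic: it would miss a
misspelling whose first letter is wrong, and it would still accept a real
word such as "flight".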

