lucene-java-user mailing list archives

From David Spencer
Subject Re: combining open office spellchecker with Lucene
Date Thu, 09 Sep 2004 16:01:12 GMT
Aad Nales wrote:

> Hi All,
> Before I start reinventing wheels I would like to do a short check to
> see if anybody else has already tried this. A customer has requested us
> to look into the possibility to perform a spell check on queries. So far
> the most promising way of doing this seems to be to create an Analyzer
> based on the spellchecker of OpenOffice. My question is: "has anybody
> tried this before?" 

I did a WordNet/synonym query expander. Search for "WordNet" on this 
page. Of interest is that it stores the WordNet info in a separate Lucene 
index, since at its essence an index is just a database.

Another variation is to spell-check based on what terms are in 
the index, not what an external dictionary says. I've done this on my 
experimental site in a dumb/inefficient way. Here's an 

After you click above it takes ~10sec as it produces terms close to 
"recursivz". Oops - looking at the output, it looks like the same word 
is suggested multiple times - ouch - I must be considering all fields, not 
just the contents field. TBD is fixing this. (Or no wonder it's so slow :))

I can/should send the code out. The logic is that for any terms in a 
query that have zero matches, go thru all the terms(!) in the index, 
calculate the Levenshtein string distance to each, and return the best 
matches. A more intelligent way of doing this is to only consider terms 
that also match the query term on the 1st "n" (prob 3) chars.
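
For anyone who wants to play with the idea before the real code goes out, here is a minimal sketch of that logic in plain Java. It is not the code running on the site. The class name TermSuggester, the method names, and the parameters prefixLen, maxDistance and maxSuggestions are all made up for illustration, and a plain in-memory collection stands in for walking the index's terms, so no Lucene calls are shown. Collecting candidates into a set also avoids suggesting the same word more than once when a term shows up in several fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: suggests index terms close to a misspelled word.
class TermSuggester {

    /** Levenshtein edit distance via dynamic programming, keeping only two rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns up to maxSuggestions terms that share the first prefixLen chars
     * with word and are within maxDistance edits, closest first.
     * indexTerms is an assumed stand-in for enumerating the terms of one field.
     */
    static List<String> suggest(String word, Collection<String> indexTerms,
                                int prefixLen, int maxDistance, int maxSuggestions) {
        String prefix = word.substring(0, Math.min(prefixLen, word.length()));
        Set<String> candidates = new LinkedHashSet<>(); // drops duplicate terms
        for (String term : indexTerms) {
            // Cheap prefix check first, so the distance is computed for few terms.
            if (term.startsWith(prefix) && levenshtein(word, term) <= maxDistance) {
                candidates.add(term);
            }
        }
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingInt((String t) -> levenshtein(word, t))
                              .thenComparing(Comparator.<String>naturalOrder()));
        return new ArrayList<>(sorted.subList(0, Math.min(maxSuggestions, sorted.size())));
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("recursive", "recursion", "record", "recursive");
        System.out.println(suggest("recursivz", terms, 3, 2, 5));
    }
}
```

Setting prefixLen to 0 gives the dumb go-thru-everything variant, since every term starts with the empty string. The prefix trick trades recall for speed: a typo in the first few chars will not be corrected.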

> Cheers,
> Aad
> --
> Aad Nales
>, +31-(0)6 54 207 340 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail: