lucene-solr-user mailing list archives

From Yao Ge <yao...@gmail.com>
Subject Re: spell checking
Date Wed, 03 Jun 2009 01:42:48 GMT

Excellent. Now everything makes sense to me. :-)

The spell checking suggestion is the closest variant of the user input that
actually exists in the main index. The so-called "correction" is relative to
the text that has been indexed, so there is no need for a brute-force list of
all correctly spelled words. Maybe we should call this "alternative search
terms" or "suggested search terms" instead of spell checking. The name is
misleading, as there is no right or wrong in spelling; there are only more
popular (term frequency?) alternatives.
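
To make the idea concrete, here is a minimal Python sketch of index-based
suggestion. It is not Lucene's actual implementation: the class name
IndexSpellChecker, the use of character bigrams, the edit-distance cutoff and
the frequency tie-break are all assumptions chosen for illustration. It only
shows that the "dictionary" can be the indexed terms themselves.

```python
from collections import Counter, defaultdict


def ngrams(word, n=2):
    """Return the set of character n-grams of a word, padded at both ends."""
    padded = "^" + word + "$"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


class IndexSpellChecker:
    """Hypothetical helper: suggests terms that already exist in the index."""

    def __init__(self, indexed_text):
        # Term frequencies stand in for the main index.
        self.freq = Counter(indexed_text.lower().split())
        # Map each n-gram to the indexed terms that contain it.
        self.gram_index = defaultdict(set)
        for term in self.freq:
            for gram in ngrams(term):
                self.gram_index[gram].add(term)

    def suggest(self, word, max_distance=2):
        word = word.lower()
        if word in self.freq:
            return None  # the term is already in the index; nothing to suggest
        # Candidates are indexed terms sharing at least one n-gram with the input.
        candidates = set()
        for gram in ngrams(word):
            candidates |= self.gram_index.get(gram, set())
        scored = [(edit_distance(word, c), -self.freq[c], c) for c in candidates]
        scored = [s for s in scored if s[0] <= max_distance]
        if not scored:
            return None
        # Closest term wins; ties go to the more frequent term.
        return min(scored)[2]


checker = IndexSpellChecker("solr lucene solr search index lucene solr")
print(checker.suggest("solar"))
print(checker.suggest("lucine"))
```

A word that is misspelled consistently throughout the indexed text would be
treated as "correct" by this sketch, which is the same caveat as above: the
suggestions are only as good as the text in the index.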

Thanks for the insight.


Otis Gospodnetic wrote:
> 
> 
> Hello,
> 
> In short, the assumption behind this type of SC is that the text in the
> main index is (mostly) correctly spelled.  When the SC finds query
> terms that are close in spelling to words indexed in the SC, it offers
> spelling suggestions/corrections using those presumably correctly spelled
> terms (there are other parameters that control the exact behaviour, but
> this is the idea).
> 
> Solr (Lucene's spellchecker, which Solr uses under the hood, actually)
> turns the input text (values from those fields you copy to the spell field)
> into so-called n-grams.  You can see that if you open up the SC index with
> something like Luke.  Please see
> http://wiki.apache.org/jakarta-lucene/SpellChecker .
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> ----- Original Message ----
>> From: Yao Ge <yaogee@gmail.com>
>> To: solr-user@lucene.apache.org
>> Sent: Tuesday, June 2, 2009 5:34:07 PM
>> Subject: Re: spell checking
>> 
>> 
>> Sorry for not being able to get my point across.
>> 
>> I know the syntax that leads to an index build for spell checking. I
>> actually ran the command and saw some additional files created in the
>> data\spellchecker1 directory. What I don't understand is what is in
>> there, as I cannot get Solr to make spelling suggestions based on the
>> query structure documented in the wiki.
>> 
>> Can anyone tell me what happens when the default spell check index is
>> built? In my case, I used copyField to copy a couple of text fields into
>> a field called "spell". These fields hold the original text; they are the
>> ones with typos that I need to run spell check on. But how can this
>> original data be used as a base for spell checking? How does Solr know
>> which words are correctly spelled?
>> 
>>   
>> multiValued="true"/>
>>   
>> multiValued="true"/>
>>    ...
>>   
>> multiValued="true"/>
>>    ...
>>   
>>   
>> 
>> 
>> 
>> Yao Ge wrote:
>> > 
>> > Can someone help by providing a tutorial-like introduction on how to
>> > get spell checking to work in Solr? It appears many steps are required
>> > before the spell-checking functions can be used. It also appears that a
>> > dictionary (a list of correctly spelled words) is required to set up
>> > the spell checker. Can anyone validate my impression?
>> > 
>> > Thanks.
>> > 
>> 
>> -- 
>> View this message in context: 
>> http://www.nabble.com/spell-checking-tp23835427p23841373.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/spell-checking-tp23835427p23844050.html
Sent from the Solr - User mailing list archive at Nabble.com.

