lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Stop Words in SpellCheckComponent
Date Fri, 01 Jun 2012 06:05:46 GMT
Your earlier email had this option on the StopFilterFactory in the analyzer 
of your spellcheck_de field type:

words="german_stop_long.txt"

But your most recent email referred to "stopwords.txt".

So either add "the" to german_stop_long.txt, or change the "words" option 
of your StopFilterFactory to refer to "stopwords.txt".

BTW, I think you can actually have a comma-separated list of stopword files, 
so you can write:

words="german_stop_long.txt,stopwords.txt"
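
In schema.xml that might look roughly like the sketch below. This is untested: 
the field type name is taken from the quoted message, while the tokenizer, 
the ignoreCase setting, and the lowercase filter are assumptions rather than 
your actual configuration.

```xml
<!-- Hypothetical field type: only the "words" value comes from this thread;
     the class, tokenizer, and remaining attributes are assumed. -->
<fieldType name="spellcheck_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Both files are loaded from the conf directory and merged into one stop set. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="german_stop_long.txt,stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

After changing the analyzer you would need to reindex and rebuild the 
spellcheck dictionary for the change to show up in suggestions.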

-- Jack Krupansky

-----Original Message----- 
From: Matthias Müller
Sent: Friday, June 01, 2012 1:44 AM
To: solr-user@lucene.apache.org
Subject: Re: Stop Words in SpellCheckComponent

> <str name="field">spellcheck_de</str>
>
> That should reference a field, not a field type.

Thanks for your help. But I did that, too.

To show that even the Solr example webapp makes suggestions for
stopwords, I ...

1. added "the" to stopwords.txt
2. replaced "the" with "thex" in the name field of an example document
3. started Solr
4. indexed the example files (sh post.sh *.xml)
5. searched for "the solr"
http://myhost:8983/solr/select?q=the+solr&spellcheck=true&wt=json
6. got the desired result, but also the wrong suggestion "thex"

{ "response" : { "docs" : [ {...  "name" : "Solr, thex Enterprise
Search Server", ..  } ],
      "numFound" : 1,
...  },
...
  "spellcheck" : { "suggestions" : [ "the",
          {...            "suggestion" : [ "thex" ]  }
        ] }
}


Here's the complete diff between the original download and my 3 
modifications:

diff -r apache-solr-3.6.0/example/exampledocs/solr.xml
apache-solr-3.6.0x/example/exampledocs/solr.xml
21c21
<   <field name="name">Solr, the Enterprise Search Server</field>
---
>   <field name="name">Solr, thex Enterprise Search Server</field>
diff -r apache-solr-3.6.0/example/solr/conf/solrconfig.xml
apache-solr-3.6.0x/example/solr/conf/solrconfig.xml
781a782,785
>      <arr name="last-components">
>        <str>spellcheck</str>
>      </arr>
>
1122a1127
>       <str name="buildOnCommit">true</str>
diff -r apache-solr-3.6.0/example/solr/conf/stopwords.txt
apache-solr-3.6.0x/example/solr/conf/stopwords.txt
14a15,16
>
> the 

