lucene-solr-user mailing list archives

From Daniel Brügge <daniel.brue...@googlemail.com>
Subject SolrCloud, Zookeeper and Stopwords with Umlaute or other special characters
Date Wed, 07 Nov 2012 16:45:45 GMT
Hi,

I am running a SolrCloud cluster on version 4.0.0. I have a stopwords file
that is in the correct encoding and contains German umlauts such as 'ü'. I am
also running a standalone ZooKeeper, which holds this stopwords file. In my
schema I reference the stopwords file in the standard way:

    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="my_stopwords.txt"
                enablePositionIncrements="true" />


While indexing, I noticed that all stopwords without umlauts are removed
correctly, but the ones with umlauts are still present in the index.
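
One way to narrow this down is to verify the bytes of the stopwords file itself, both the local copy and the copy fetched back from ZooKeeper. The following is only a minimal sketch: `check_utf8` is a hypothetical helper, not part of Solr or ZooKeeper, and it assumes that the word list is expected to be UTF-8. It shows that 'ü' is two bytes (c3 bc) in UTF-8 but a single byte (fc) in ISO-8859-1, and that a strict UTF-8 decode rejects the latter.

```python
# Hypothetical diagnostic helper; not part of Solr or ZooKeeper.
# Assumption: the stopwords file is expected to be UTF-8 encoded.

def check_utf8(data):
    """Decode raw file bytes as strict UTF-8 and return the stopword lines.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8
    (for example, if the file was saved as ISO-8859-1).
    Blank lines and lines starting with '#' are skipped.
    """
    text = data.decode("utf-8", errors="strict")
    words = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            words.append(stripped)
    return words


# Sample data standing in for the real file contents.
sample = "für\nüber\n"
utf8_bytes = sample.encode("utf-8")
latin1_bytes = sample.encode("iso-8859-1")

# The same character has different byte representations per encoding.
print("ü as UTF-8:     ", "ü".encode("utf-8").hex())
print("ü as ISO-8859-1:", "ü".encode("iso-8859-1").hex())

print(check_utf8(utf8_bytes))

try:
    check_utf8(latin1_bytes)
except UnicodeDecodeError as exc:
    print("not valid UTF-8:", exc.reason)
```

To apply this to a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to `check_utf8`. If the local copy passes but the copy stored in ZooKeeper does not, the upload step would be the likely place where the encoding was changed.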

Is this a problem with ZK or Solr?

Thanks & regards

Daniel
