lucene-solr-user mailing list archives

From PeterKerk
Subject Re: Indexing fieldvalues with dashes and spaces
Date Mon, 09 Aug 2010 07:54:04 GMT

Hi Erick,

OK, it's clearer now. I am indeed using the whitespace tokenizer:

    <fieldType name="textTrue" class="solr.TextField" positionIncrementGap="100" >
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_dutch.txt" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.ISOLatin1AccentFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="Dutch"

I have a field value "Beach & Sea", which is a theme for a location.
Because of the whitespace tokenizer, it gets split up into 2 separate
facet values:
(see below)

Of course those individual facet names are NOT correct facet names, because
the facet should be "Beach & Sea".
But if I REMOVE the whitespace tokenizer, Solr throws an error saying that a
field type must always have a tokenizer.
So which tokenizer would I need in order to get the correct facet
(I've been checking this page
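
For illustration, one common way to keep a value such as "Beach & Sea"
intact as a single facet value is to facet on a separate field whose type
uses solr.KeywordTokenizerFactory, which emits the entire input as one
token. This is only a sketch under assumptions: the type name "facetExact"
and the field names "theme" and "theme_facet" are hypothetical and do not
come from the schema above.

```xml
<!-- Hypothetical type name. KeywordTokenizerFactory does not split the
     input, so "Beach & Sea" is indexed as one token. -->
<fieldType name="facetExact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field names. "theme" keeps the analyzed textTrue type for
     searching; "theme_facet" receives an untokenized copy for faceting. -->
<field name="theme" type="textTrue" indexed="true" stored="true"/>
<field name="theme_facet" type="facetExact" indexed="true" stored="false"/>
<copyField source="theme" dest="theme_facet"/>
```

With a setup along these lines, faceting would be requested on the copy
(facet.field=theme_facet) while searches keep using the analyzed field. A
plain solr.StrField type for the facet field is another option if no
analysis at all is wanted.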

