lucene-solr-user mailing list archives

From Martin Grotzke <martin.grot...@javakaffee.de>
Subject Indexing question - split word and comma
Date Thu, 05 Jul 2007 18:43:10 GMT
Hi all,

I have a document with a name field like this:
<field name='name'>MP3-Player, Apple, &#xBB;iPod nano&#xAB;, silber, 4GB</field>

and I want to find it by searching for "apple". Unfortunately, it only matches when I search for "apple," (with the trailing comma).

Can anybody help me with this?


The schema.xml contains the following field definition:
<field name="name" type="text" indexed="true" stored="true"/>

and this fieldType definition for type text:
    <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldtype>
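
To illustrate what I suspect is happening, here is a rough Python sketch of the index-time analyzer above. It is not Solr code: the function name analyze_index and the empty stopword set are made up for illustration, and it assumes the whitespace tokenizer splits on whitespace only, so punctuation stays attached to the neighbouring word.

```python
# Rough simulation of the index-time analyzer chain (NOT Solr code).
# Assumption: the whitespace tokenizer splits on whitespace only, so a
# trailing comma remains part of the token.

def analyze_index(text, stopwords=frozenset()):
    """Hypothetical stand-in for the 'index' analyzer of the 'text' type."""
    # WhitespaceTokenizerFactory: split on runs of whitespace
    tokens = text.split()
    # StopFilterFactory with ignoreCase="true" (stopword list assumed empty here)
    tokens = [t for t in tokens if t.lower() not in stopwords]
    # LowerCaseFilterFactory
    tokens = [t.lower() for t in tokens]
    # RemoveDuplicatesTokenFilterFactory is omitted: it has nothing to do here
    return tokens

# The field value from the document, with the XML entities decoded
name = "MP3-Player, Apple, \u00bbiPod nano\u00ab, silber, 4GB"
tokens = analyze_index(name)
print(tokens)
```

If that assumption holds, the indexed term is "apple," rather than "apple", which would match what I see when searching.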

The default search field is "text":
<defaultSearchField>text</defaultSearchField>

with the following definition:
   <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

and this copyField from "name" to "text":
<copyField source="name" dest="text"/>



Thanks in advance,
cheers,
Martin


