lucene-solr-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com>
Subject Re: Whole unfiltered content in response document field
Date Sat, 07 May 2011 16:21:56 GMT

> <analyzer type="index">
>   <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   <filter class="solr.StopFilterFactory"
>           ignoreCase="true"
>           words="stopwords.txt"
>           enablePositionIncrements="true"/>
>   <filter class="solr.WordDelimiterFilterFactory"
>           generateWordParts="1" generateNumberParts="1"
>           catenateWords="1" catenateNumbers="1" catenateAll="0"
>           splitOnCaseChange="1"/>
>   <filter class="solr.LowerCaseFilterFactory"/>
> </analyzer>
> 
> ... 
> 
> <fields>
>   <field name="id" type="int" indexed="true" stored="true" required="true"/>
>   <field name="text" type="text" indexed="true" stored="true"/>
> </fields>
> 
> In the analysis view, my filters work properly: at the end of the filter
> chain I have only the tokens I am interested in. But when I search with
> Solr, the response contains the whole content of the indexed database
> field. The field contains stopwords, whitespace, uppercase letters and so
> on. I can search for stopwords and find them. I would expect the field in
> the response document to contain only the filtered content, not the
> original raw content that I submitted for indexing.
> 
> Is this normal behaviour? Do I understand Solr correctly?

In the response, Solr returns the raw stored content of a field, not its analyzed
tokens. So you want to see the analyzed/indexed content of a document in the response?

Being able to search for stop words and find them is not normal. Maybe you need to move
the StopFilter below the WordDelimiterFilter; some punctuation may be causing this.
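
A minimal sketch of that reordering, assuming the rest of the field type stays exactly
as you posted it. The idea is that WhitespaceTokenizer leaves punctuation attached to a
token (for example "the,"), so a StopFilter that runs first does not match it against
stopwords.txt; running WordDelimiterFilter first strips the punctuation before the stop
word check. Whether this is actually the cause in your index is a guess until you check
the offending terms on the analysis page.

```xml
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- WordDelimiterFilter now runs first, so punctuation is removed
       before the stop word lookup -->
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1"
          catenateWords="1" catenateNumbers="1" catenateAll="0"
          splitOnCaseChange="1"/>
  <!-- StopFilter moved below WordDelimiterFilter; attributes unchanged -->
  <filter class="solr.StopFilterFactory"
          ignoreCase="true"
          words="stopwords.txt"
          enablePositionIncrements="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
```

After changing the index analyzer you need to re-index your documents for the change to
take effect on existing content.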
