lucene-solr-user mailing list archives

From sam <skyn...@gmail.com>
Subject can I use different tokenizer/analyzer for facet count query?
Date Wed, 25 Apr 2012 14:41:01 GMT
I have the following in schema.xml:
    <fieldType name="cq_tag" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="$"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.KeywordTokenizerFactory"/>
        </analyzer>
    </fieldType>
    <field name="colors" type="cq_tag" indexed="true" stored="true" multiValued="true"/>


And I have the following doc:
<doc>
    <arr name="colors">
        <str>blues$Teal/Turquoise</str>
    </arr>
    ...
</doc>


The response to the query
http://localhost:8983/solr/select/?q=*:*&facet=true&facet.field=colors&rows=100
is:

<lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields">
        <lst name="colors">
            <int name="blues">1</int>
            <int name="blues$Teal/Turquoise">1</int>
        </lst>
    </lst>
    <lst name="facet_dates"/>
    <lst name="facet_ranges"/>
</lst>



During indexing, blues$Teal/Turquoise is tokenized into:
blues
blues$Teal/Turquoise

I think that's why the facet counts include both blues and blues$Teal/Turquoise: faceting counts the indexed terms, and the query-time analyzer is not applied to them.
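
To make that reasoning concrete, here is a minimal Python sketch of the behavior as I understand it. It is a simplified model, not Lucene's actual implementation: the function names path_hierarchy_tokens and facet_counts are hypothetical, and it assumes that field faceting counts one hit per document for each distinct indexed term.

```python
from collections import Counter


def path_hierarchy_tokens(value, delimiter="$"):
    """Return every delimiter-bounded prefix of value, shortest first.

    Simplified stand-in for what PathHierarchyTokenizerFactory emits
    at index time (hypothetical helper, not a Solr API).
    """
    parts = value.split(delimiter)
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


def facet_counts(docs, delimiter="$"):
    """Count, per indexed term, how many documents contain it.

    docs is a list of documents, each given as the list of values
    of its multiValued field.
    """
    counts = Counter()
    for values in docs:
        terms = set()  # a document counts at most once per term
        for value in values:
            terms.update(path_hierarchy_tokens(value, delimiter))
        counts.update(terms)
    return dict(counts)


tokens = path_hierarchy_tokens("blues$Teal/Turquoise")
counts = facet_counts([["blues$Teal/Turquoise"]])
print(tokens)
print(counts)
```

Under those assumptions the single document contributes one count to each prefix, which matches the two facet entries in the response above.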

Can I make the facet counts include only the whole keyword,
blues$Teal/Turquoise, and not blues?
