lucene-solr-user mailing list archives

From Marc Sturlese <marc.sturl...@gmail.com>
Subject Re: How to avoid space on facet field
Date Wed, 03 Jun 2009 12:06:25 GMT

Yeah, that's the point. Once you have this, you can use copyField as was
written above with the "string" example.

Bny Jo wrote:
> 
> Anshuman, thanks for your input. I will try that; I understand what you
> are suggesting.
> 
> Marcus, I did not understand how your KeywordTokenizer works. Do I
> have to define a separate field type, like the ones in the example schema,
> and use it for that field? This is what I came up with.
> 
>   <fieldType name="facet_tex" class="solr.TextField" sortMissingLast="true"
>              omitNorms="true">
>     <analyzer>
>       <tokenizer class="solr.KeywordTokenizerFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <!-- The TrimFilter removes any leading or trailing whitespace -->
>       <filter class="solr.TrimFilterFactory"/>
>       <filter class="solr.PatternReplaceFilterFactory"
>               pattern="([^a-z])" replacement="" replace="all"/>
>     </analyzer>
>   </fieldType>
> 
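As a rough illustration of what the analyzer chain quoted above would do to a value such as "Dell, Inc.", here is a plain-Java imitation of its four steps. It does not call Solr or Lucene; the class name `FacetAnalyzerSketch` and the method `analyze` are made up for this sketch, and the standard-library calls only approximate the real filters.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Plain-Java imitation of the quoted analyzer chain; it does not use Solr classes. */
final class FacetAnalyzerSketch {
    // Same regex as the quoted PatternReplaceFilterFactory: any character outside a-z.
    private static final Pattern NON_LETTERS = Pattern.compile("([^a-z])");

    /** Returns the single token the chain would roughly emit for one field value. */
    static String analyze(String value) {
        String token = value;                              // KeywordTokenizer: whole value stays one token
        token = token.toLowerCase(Locale.ROOT);            // LowerCaseFilter
        token = token.trim();                              // TrimFilter
        return NON_LETTERS.matcher(token).replaceAll("");  // PatternReplaceFilter with replace="all"
    }

    public static void main(String[] args) {
        System.out.println(analyze("Dell, Inc."));         // prints: dellinc
        System.out.println(analyze(" ViewSonic Corp. "));  // prints: viewsoniccorp
    }
}
```

Because facet values come from the indexed terms, a field of this type would keep "Dell, Inc." together as one facet entry, but the entry would read "dellinc": the pattern also strips spaces, digits and punctuation. If the original spelling should appear in the facet list, the pattern filter would have to be relaxed or dropped.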
> 
> 
> Thanks
> 
> Boney
> 
> 
> ________________________________
> From: Marc Sturlese <marc.sturlese@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, June 3, 2009 3:45:49 AM
> Subject: Re: How to avoid space on facet field
> 
> 
> You can configure a "facet_text" type instead of the normal "text" type.
> There you use KeywordTokenizer instead of StandardTokenizer. One of the
> advantages of using it instead of "string" is that it still lets you use
> synonyms, stopwords, filters and all the other features of an analyzer.
> 
> 
> Anshuman Manur wrote:
>> 
>> Hey,
>> 
>> From what you have written, I'm guessing that in your schema.xml file you
>> have defined the field manu to be of type "text", which is good for keyword
>> searches, as the text type tokenizes on whitespace, i.e. Dell Inc. is
>> indexed as dell and inc, so a keyword search matches on either dell or inc.
>> But when you want to facet on a particular field, you want exact matches
>> regardless of whitespace in between. In such cases it's a good idea to use
>> the string type.
>> Let me illustrate with an example based on my settings:
>> 
>> Here are my fields:
>> 
>>    <!-- Core Fields -->
>>    <field name="id" type="string" indexed="true" stored="true" required="true" />
>>    <field name="name" type="text" indexed="true" stored="true"/>
>>    <field name="manu" type="text" indexed="true" stored="true"/>
>>    <field name="sport" type="text" indexed="true" stored="true" />
>>    <field name="type" type="text" indexed="true" stored="true" />
>>    <field name="desc" type="text" indexed="true" stored="true" />
>>    <field name="ldesc" type="text" indexed="true" stored="true" />
>> 
>>    <!-- default text Field for searching -->
>>    <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
>> 
>>    <!-- exact string fields for faceting -->
>>    <field name="sport_exact" type="string" indexed="true" stored="false" />
>>    <field name="manu_exact" type="string" indexed="true" stored="false" />
>>    <field name="type_exact" type="string" indexed="true" stored="false" />
>> 
>>    <copyField source="manu" dest="text"/>
>>    <copyField source="name" dest="text"/>
>>    <copyField source="sport" dest="text"/>
>>    <copyField source="desc" dest="text"/>
>>    <copyField source="ldesc" dest="text"/>
>>    <copyField source="type" dest="text"/>
>> 
>>    <copyField source="manu" dest="manu_exact"/>
>>    <copyField source="sport" dest="sport_exact"/>
>>    <copyField source="type" dest="type_exact"/>
>> 
>> So, when doing keyword searches I use the <field name="text"...> to search
>> across all the fields, as I copyField every field into the field named
>> text. But for faceting I use the exact fields, which are of type string and
>> don't split on whitespace.
>> 
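To make the difference between the two field types concrete, the sketch below counts facet values both ways in plain Java. It is only an approximation under stated assumptions: the "text" type is imitated by lowercasing and splitting on non-alphanumeric characters, the "string" type by keeping the value untouched, and the names `FacetCountSketch`, `tokenizedFacets` and `exactFacets` are hypothetical. No Solr code is involved.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Imitates facet counting on a tokenized field versus an untokenized one. */
final class FacetCountSketch {

    /** Rough stand-in for a "text" field: each distinct term of a value is counted once per value. */
    static Map<String, Integer> tokenizedFacets(List<String> values) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String value : values) {
            Set<String> terms = new LinkedHashSet<>(
                    Arrays.asList(value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")));
            terms.remove(""); // split() can yield an empty leading piece
            for (String term : terms) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Rough stand-in for a "string" field: the whole value is the only term. */
    static Map<String, Integer> exactFacets(List<String> values) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String value : values) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> manu = Arrays.asList("Dell, Inc.", "Dell, Inc.", "ViewSonic Corp.");
        System.out.println(tokenizedFacets(manu)); // {dell=2, inc=2, viewsonic=1, corp=1}
        System.out.println(exactFacets(manu));     // {Dell, Inc.=2, ViewSonic Corp.=1}
    }
}
```

The first map has the same shape as the `dell (5), inc (5)` output in the original question; the second is what faceting on a copy such as `manu_exact` is meant to produce.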
>> 
>> Anshu
>> 
>> On Wed, Jun 3, 2009 at 1:50 AM, Bny Jo <bnykjo@yahoo.com> wrote:
>> 
>>>
>>> Hello,
>>>
>>>  I am wondering why Solr is returning a manufacturer name field (Dell,
>>> Inc) as two results, Dell as one and Inc as another. Is there a way to
>>> facet on a field whose values contain spaces or delimiters?
>>>
>>> query.addFacetField("manu");
>>> query.setFacetMinCount(1);
>>> query.setIncludeScore(true);
>>> // qr is the QueryResponse obtained by running the query (not shown)
>>> List<FacetField> facetFieldList = qr.getFacetFields();
>>> for (FacetField facetField : facetFieldList) {
>>>     System.out.println(facetField.toString() + "Manufactures");
>>> }
>>> And it returns
>>> -----------------
>>> [manu:[dell (5), inc (5), corp (1), sharp (1), sonic (1), view (1),
>>> viewson (1), vizo (1)]]
>>>
>>>
>>>
>>>
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/How-to-avoid-space-on-facet-field-tp23840037p23847742.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
>       
> 

-- 
View this message in context: http://www.nabble.com/How-to-avoid-space-on-facet-field-tp23840037p23850245.html
Sent from the Solr - User mailing list archive at Nabble.com.

