lucene-solr-user mailing list archives

From Anshuman Manur <anshuman_ma...@stragure.com>
Subject Re: How to avoid space on facet field
Date Wed, 03 Jun 2009 03:10:59 GMT
Hey,

From what you have written, I'm guessing that in your schema.xml file you
have defined the field manu to be of type "text". That is good for keyword
searches, because the text type splits the value into terms on whitespace
and, judging by your output, also lowercases them: "Dell, Inc." is indexed as
the two terms dell and inc, so a keyword search matches either dell or inc.
But when you facet on a field, you want each whole value counted as is,
spaces included. In such cases it's a good idea to use the string type, which
indexes the entire value as a single term. Let me illustrate with an example
based on my settings:

Here are my fields:

   <!-- Core Fields -->
   <field name="id" type="string" indexed="true" stored="true" required="true" />
   <field name="name" type="text" indexed="true" stored="true"/>
   <field name="manu" type="text" indexed="true" stored="true"/>
   <field name="sport" type="text" indexed="true" stored="true" />
   <field name="type" type="text" indexed="true" stored="true" />
   <field name="desc" type="text" indexed="true" stored="true" />
   <field name="ldesc" type="text" indexed="true" stored="true" />

   <!-- default text Field for searching -->
   <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

   <!-- exact string fields for faceting -->
   <field name="sport_exact" type="string" indexed="true" stored="false" />
   <field name="manu_exact" type="string" indexed="true" stored="false" />
   <field name="type_exact" type="string" indexed="true" stored="false" />

   <copyField source="manu" dest="text"/>
   <copyField source="name" dest="text"/>
   <copyField source="sport" dest="text"/>
   <copyField source="desc" dest="text"/>
   <copyField source="ldesc" dest="text"/>
   <copyField source="type" dest="text"/>

   <copyField source="manu" dest="manu_exact"/>
   <copyField source="sport" dest="sport_exact"/>
   <copyField source="type" dest="type_exact"/>

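So, for keyword searches I query the field named text, which receives a copy
of every other field through the copyField directives. For faceting I use the
exact fields (manu_exact and so on), which are of type string and are
therefore not split on whitespace. In your code that would mean calling
query.addFacetField("manu_exact") instead of query.addFacetField("manu"),
assuming you add a manu_exact field like mine. Note that copyField is applied
at index time, so you need to reindex your documents after changing the
schema.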
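
To make the difference concrete, here is a small self-contained Java sketch.
It is not Solr code: the class FacetSketch and its methods are hypothetical,
and textTerms is only a crude stand-in for what a text-type analyzer does
(lowercase, split on non-alphanumeric characters). It just shows why counting
per term gives you dell and inc separately, while counting per whole value
gives you "Dell, Inc." as one facet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: a simplified model of term-based facet counting.
class FacetSketch {

    // Assumed behaviour of a "text"-style field: lowercase the value, then
    // split on anything that is not a letter or digit.
    static List<String> textTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // A "string"-style field: the whole value is kept as a single term.
    static List<String> stringTerms(String value) {
        return Collections.singletonList(value);
    }

    // For each term, count the number of documents that contain it.
    static Map<String, Integer> facetCounts(List<String> docs, boolean tokenized) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String doc : docs) {
            List<String> terms = tokenized ? textTerms(doc) : stringTerms(doc);
            // TreeSet removes duplicates so a document counts once per term.
            for (String term : new TreeSet<>(terms)) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> manu = Arrays.asList("Dell, Inc.", "Dell, Inc.", "Sharp Corp.");
        // prints: text:   {corp=1, dell=2, inc=2, sharp=1}
        System.out.println("text:   " + facetCounts(manu, true));
        // prints: string: {Dell, Inc.=2, Sharp Corp.=1}
        System.out.println("string: " + facetCounts(manu, false));
    }
}
```

The first line of output has the same shape as the facet list in your mail;
the second is what faceting on a string field such as manu_exact should look
like.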


Anshu

On Wed, Jun 3, 2009 at 1:50 AM, Bny Jo <bnykjo@yahoo.com> wrote:

>
> Hello,
>
>  I am wondering why solr is returning a manufacturer name field ( Dell,
> Inc) as Dell one result and Inc another result. Is there a way to facet a
> field which have space or delimitation on them?
>
> query.addFacetField("manu");
> query.setFacetMinCount(1);
> query.setIncludeScore(true);
> List<FacetField> facetFieldList = qr.getFacetFields();
> for (FacetField facetField : facetFieldList) {
>     System.out.println(facetField.toString() + "Manufactures");
> }
> And it returns
> -----------------
> [manu:[dell (5), inc (5), corp (1), sharp (1), sonic (1), view (1), viewson
> (1), vizo (1)]]
>
>
>
>
