lucene-solr-user mailing list archives

From Otis Gospodnetic <>
Subject Re: Single value vs multi value setting in tokenized field
Date Mon, 17 Jan 2011 05:21:25 GMT

I'm not a big fan of putting all fields in a single field (bye bye dismax, as
you say), but if you are asking whether doing it via copyField or "directly"
will make a difference - not really.
If you do it with copyField, you still get to keep your individual fields, which
could serve you down the road, but the index is bigger because you have
duplicate data for those fields.
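
To make the trade-off concrete, here is a rough sketch of the two alternatives as schema.xml fragments. The field names come from the original question; the field types, the stored/indexed settings on the individual fields, and the Option B layout are assumptions for illustration only, not a recommended schema. The two options are alternatives, not meant to sit in one schema together.

```xml
<!-- Option A (copyField): individual fields remain searchable on their own,
     and their text is indexed a second time in "content". -->
<field name="title"       type="text" indexed="true" stored="true" />
<field name="description" type="text" indexed="true" stored="true" />
<field name="tags"        type="text" indexed="true" stored="true" multiValued="true" />
<field name="content"     type="text" indexed="true" stored="false"
       termVectors="true" multiValued="true" />
<copyField source="title"       dest="content" />
<copyField source="description" dest="content" />
<copyField source="tags"        dest="content" />

<!-- Option B ("directly", assumed layout): the indexing client builds the
     combined text itself and sends it as "content". The individual fields
     are stored for display only, so their text is indexed once. -->
<field name="title"       type="text" indexed="false" stored="true" />
<field name="description" type="text" indexed="false" stored="true" />
<field name="tags"        type="text" indexed="false" stored="true" multiValued="true" />
<field name="content"     type="text" indexed="true" stored="false"
       termVectors="true" multiValued="true" />
```

With Option A, directed queries such as title:foo still work and dismax over the individual fields stays possible later. With Option B the index is smaller, but only "content" can be searched.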

Sematext :: :: Solr - Lucene - Nutch
Lucene ecosystem search ::

----- Original Message ----
> From: kenf_nc <>
> To:
> Sent: Sun, January 16, 2011 3:47:56 PM
> Subject: Single value vs multi value setting in tokenized field
> I have to support both general searches (free form text) and directed
> searches (field:val field2:val). To do the general search I have a field
> defined as:
>   <field name="content" type="text" indexed="true" stored="false"
>          termVectors="true" multiValued="true" />
> and several copyField commands like:
>   <copyField source="description" dest="content" />
>   <copyField source="title" dest="content" />
>   <copyField source="tags" dest="content" />
>   <copyField source="features" dest="content" />
> Note that tags and features are multi-value themselves. So after indexing I
> have a 'general text' bucket with numerous (usually in the 20 to 30 range)
> rows of strings.
> My question is would it be better, for indexing speed and search
> speed/quality, to concatenate all the text into a single string and store it
> in "content" as one value? What are the implications on search results? If
> Description is say a couple paragraphs of text and tags are
> "Cuisine","Italian","Romantic" would the tags get lost in the muck of the
> bigger text?
> One thing to keep in mind. I'm sure some of you are going to say 'Dismax'
> and in some situations I will, but my index has numerous document types that
> have vastly different schemas. Another document may not have "title" and
> "features" but might have "recommendations" and "location". In a general
> query it wouldn't make sense to include every possible field in a dismax
> query, I don't even know what all the fields are, new ones are added all the
> time.
> Has anyone got advice, suggestions on this topic (blending directed search
> with general search)?
> Thanks in advance,
> Ken
