lucene-solr-user mailing list archives

From Chris Hostetter
Subject Re: newbie Q regarding schema configuration
Date Tue, 20 Jun 2006 07:58:30 GMT

: so.. my first question in schema.xml, can you have a composite key as
: the 'uniquekey' field, or do i need to do this on the client side?

at the moment this would need to be done client side, but you're not the
first person to ask so i've added it to the TaskList ... it doesn't seem
like it would be too hard.
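
A minimal sketch of the client-side approach, assuming the key parts are plain strings that get joined into one value before the document is sent to Solr. The class name CompositeKey, the '|' separator and the backslash escape are all assumptions made for illustration; none of them come from Solr.

```java
import java.util.StringJoiner;

// Hypothetical helper (not part of Solr): joins several source fields into a
// single string that can be indexed as the value of the one uniqueKey field.
final class CompositeKey {
    private static final char SEP = '|';   // assumed separator
    private static final char ESC = '\\';  // assumed escape character

    static String build(String... parts) {
        StringJoiner key = new StringJoiner(String.valueOf(SEP));
        for (String part : parts) {
            key.add(escape(part));
        }
        return key.toString();
    }

    // Escape the separator and the escape character themselves, so that
    // ("a|b", "c") and ("a", "b|c") do not collapse into the same key.
    private static String escape(String part) {
        StringBuilder sb = new StringBuilder();
        for (char c : part.toCharArray()) {
            if (c == SEP || c == ESC) {
                sb.append(ESC);
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

The same function has to be used both when indexing and when deleting or updating by id, otherwise the keys will not match.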

: can you have complex types which are multivalued?
: I'd like to store something like
: a tag-name with a corresponding tag-weighting.

There's nothing like that built into Solr - the best way to model that
would probably be to use the term frequency to represent the weight - you
could have an analyzer that parsed input like...

   "blue state"^2 "democrat"^1 "john kerry"^5

...and converted it into a stream of tokens like...

   [blue state] [blue state] [democrat] [john kerry] [john kerry]...

...kind of kludgy, but that's the best mechanism Lucene has at the moment
(there are plans to add more generic term attributes, but that's still
currently a design thing)
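
The expansion step described above can be sketched in plain Java. This is only the parsing and repeating logic, not a Lucene Analyzer; wiring it into a TokenStream is left out. The class name WeightedTagExpander and the exact input grammar (a quoted tag followed by ^ and a whole number) are assumptions based on the example input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns "tag"^N input into N copies of the tag, so that
// the term frequency in the index stands in for the weight.
final class WeightedTagExpander {
    // Matches one quoted tag followed by ^ and an integer weight.
    private static final Pattern TAG = Pattern.compile("\"([^\"]+)\"\\^(\\d+)");

    static List<String> expand(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TAG.matcher(input);
        while (m.find()) {
            String tag = m.group(1);
            int weight = Integer.parseInt(m.group(2));
            // Emit the whole tag as a single token, repeated 'weight' times.
            for (int i = 0; i < weight; i++) {
                tokens.add(tag);
            }
        }
        return tokens;
    }
}
```

Anything in the input that does not match the assumed pattern is silently skipped here; a real analyzer would need to decide whether that should be an error.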

: can you do sum(*) type queries in lucene/solr? it is efficient ? or
: are you better having a 2nd index which has these sum(*) values in it
: and keep it up to date instead.

sums across multiple documents, or sums of values in a single document?
in the latter case, you don't need a separate index, just another field.
in the former case it's really a question of what sets of documents you
want sums across? .. if it's all of them then you could just store that
info in a flat file, or a special metadata document in your index ..

if what you want is more of a run-time calculation then you can certainly
do it in a custom request handler (and you can use a SolrCache and a
custom CacheRegenerator to make sure the values are cached for as long as
the searcher is open, and autowarmed when a new one is opened).  Generally
the best way to do math operations on sets of documents in Lucene is using
the FieldCache, and this is certainly available to Lucene request
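
A rough sketch of the FieldCache style of summing, in plain Java with no Lucene dependency. It assumes the field values are already loaded into an array indexed by internal document id (which is the shape of data the FieldCache provides) and that the set of matching documents is available as a BitSet. The class name FieldCacheStyleSum and both parameter names are made up for illustration.

```java
import java.util.BitSet;

// Hypothetical helper: sums one numeric field over a set of matching documents.
final class FieldCacheStyleSum {
    // valuesByDocId[doc] holds the field value for that document id (assumed
    // to have been loaded once per searcher and reused across requests).
    static double sum(float[] valuesByDocId, BitSet matchingDocs) {
        double total = 0.0;
        // Walk only the set bits, i.e. only the documents that matched.
        for (int doc = matchingDocs.nextSetBit(0); doc >= 0; doc = matchingDocs.nextSetBit(doc + 1)) {
            total += valuesByDocId[doc];
        }
        return total;
    }
}
```

The point of the array lookup is that no stored fields are read per document, so the cost per request is roughly one array access for each matching document.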

