lucene-solr-user mailing list archives

From Shawn Heisey
Subject Re: sort by field length
Date Thu, 26 Aug 2010 16:11:32 GMT
On 5/24/2010 6:30 AM, Sascha Szott wrote:
> Hi folks,
> is it possible to sort by field length without having to (redundantly) 
> save the length information in a separate index field? At first, I 
> thought to accomplish this using a function query, but I couldn't find 
> an appropriate one.

I have a slightly different need related to this, though it may turn out 
that what Sascha wants is similar.  I would like to understand my data 
better so I can improve my schema.  I need to do some data mining that 
is (to my knowledge) difficult or impossible with the source database.  
Performance is irrelevant, as long as it finishes eventually.  
Completing in less than an hour would be nice.

I would do this on a test system with much lower performance and memory 
(4GB) than my production servers, as a single index instead of multiple 
shards.  When it finishes building, the entire test index is likely to 
be about 75GB.

What I'm after is an output that would look very much like faceting, but 
I want it to show document counts associated with field length (for a 
simple string) and number of terms (for a tokenized field) instead of 
field value.  Can Solr do that, and if so, what do I need to have 
enabled in the schema to get it?  Would branch_3x be enough, or would 
trunk be better?
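
To make the requested output concrete, here is a minimal sketch in Python of the kind of counting described above. It runs outside Solr and is not a Solr feature. It assumes the field values have already been exported into a list, and it uses whitespace splitting as a rough stand-in for whatever analyzer the tokenized field really uses. The function names and sample values are hypothetical.

```python
from collections import Counter


def length_histogram(values):
    """Map each string length to the number of documents with that length."""
    return Counter(len(v) for v in values)


def term_count_histogram(values):
    """Map each term count to the number of documents with that many terms."""
    # Assumption: str.split() (whitespace) approximates the field's analyzer.
    return Counter(len(v.split()) for v in values)


# Hypothetical sample values, one per document.
docs = ["solr", "lucene", "sort by field length", "facet"]

# Facet-like listing: one line per distinct length, with its document count.
for length, count in sorted(length_histogram(docs).items()):
    print(f"length={length}\tdocs={count}")

# Same listing keyed by number of terms.
for terms, count in sorted(term_count_histogram(docs).items()):
    print(f"terms={terms}\tdocs={count}")
```

Each output line pairs a bucket (a length or a term count) with a document count, which is the shape a facet response would have if the facet key were the length rather than the field value.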

