lucene-java-user mailing list archives

From Mark Miller
Subject Re: Sorting
Date Tue, 05 Aug 2008 19:38:33 GMT
Hey Andre,

The javadoc says the field should not be tokenized because of exactly
the issue you point out. What you want to do is possible, of course, but
changing the Lucene code to support it would complicate a process that
can already be quite memory- and CPU-intensive on large collections.
Done right, it might make a good patch, though.

A compromise that you can make outside of the Lucene code is to index a
separate field with the same contents but untokenized. If you sort on
this field instead, Lucene will treat "North Carolina" as one token and
sort as you'd expect. The downside to this approach is that you will
have to maintain both fields from now on: search on the tokenized one,
sort on the untokenized one, and keep their contents in sync.
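
As a rough illustration of the two-field idea, here is a small plain-Java
model. It does not use the Lucene API: the class TwoFieldSortSketch, the
Doc type, and searchSorted are hypothetical names made up for this
sketch. Each value is kept twice, once split into lowercase tokens for
matching and once whole for sorting. In Lucene of this era the whole copy
would correspond to a second field indexed with Field.Index.UN_TOKENIZED.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class TwoFieldSortSketch {

    /** Hypothetical stand-in for a document that stores one value in two forms. */
    static final class Doc {
        final List<String> searchTokens; // models the tokenized field (used for matching)
        final String sortKey;            // models the untokenized field (used for sorting)

        Doc(String value) {
            // Assumption: simple lowercase + whitespace split, not a real Lucene analyzer.
            this.searchTokens = Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\s+"));
            this.sortKey = value;
        }
    }

    /** Matches a prefix against the tokens, then orders the hits by the whole string. */
    static List<String> searchSorted(List<String> values, String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<String>();
        for (String value : values) {
            Doc doc = new Doc(value);
            for (String token : doc.searchTokens) {
                if (token.startsWith(p)) {
                    hits.add(doc.sortKey); // report the untokenized copy
                    break;                 // one hit per document is enough
                }
            }
        }
        Collections.sort(hits); // natural String order on the whole value
        return hits;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("North Carolina", "British Columbia", "Canada");
        // prints [British Columbia, Canada, North Carolina]
        System.out.println(searchSorted(values, "c"));
    }
}
```

The point of the model is only that matching and ordering read from
different copies of the same value, which is why the two have to be
written together whenever a document is added or updated.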

- Mark

Andre Rubin wrote:
> Hi there!
> I'm new to Lucene, so forgive any misconceptions on my part.
> I created an Index and now I want to search on it based on a field.
> The field is a String field and Field.Store.YES and
> Field.Index.TOKENIZED. No problems with the search.
> Now, I wanted to sort the results, and according to the Sort javadoc
> the field "should not be tokenized". But I decided to try it anyway,
> and it worked. However, the results showed that the tokens were
> sorted, not the full string in the field.
> Just to make myself more clear, here's an example. Let's say I have
> these strings indexed:
> "North Carolina"
> "British Columbia"
> "Canada"
> Now I search (with sort) for the token 'c*'
> The result I get is (sorted by the token found):
> 1) Canada
> 2) North Carolina
> 3) British Columbia
> The result I wanted was (sorted by the whole String):
> 1) British Columbia
> 2) Canada
> 3) North Carolina
> Is there a way to do this?
> Another option would be to sort the index itself, since this field is
> the only field that we'd be searching on. But I'm just guessing here,
> cause I have no idea if this is possible at all!
> Thanks,
> Andre