lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shailendra Sharma" <shailendra.sha...@gmail.com>
Subject Re: Prioiritze new documents
Date Tue, 01 Jan 2008 18:26:24 GMT
>
> I got a large index and when searching for a term I want the newer
> documents be at the begining of the result set. I dont need a real order
> by time but lucene should prioritze the newer documents.
> I got the time of the document creation as a index-field but it takes
> very long if I would force lucene to sort by this field, so I want to
> avoid this and instead use a less sharp priority based ordering.
>
> Can anybody give me a hint?
>

I am not sure why you don't want to use SortField for your purpose. It's not
as bad option as you describe above. See, lucene already sorts the documents
on SCORE. Now for your problem what you additionally want is do sorting by
date if SCORE is same - for very large index (say of size N) you are just
adding O(N) complexity to the whole process... which is linear in nature.
Because you would be doing sorting within similar score buckets (say average
size is k) - this would take k * lg k. For very large index there would be
lots of such buckets (say B), we can safely assume B >>> k. So total
additional complexity is B k lg k. For B >>> k, k lg k would be equivalent
to 1. Hence additional complexity is only B or N (a better proxy).

I hope I am clear.

Feel free to comment.

P.S: In our application we are already using SortField based mechanism...
it's not adding lots of extra CPU cycles to the whole process of search.

Regards,
Shailendra

On Dec 30, 2007 8:40 PM, Grant Ingersoll <gsingers@apache.org> wrote:

> I think it is the case that Luke doesn't show the boosts, see http://www.gossamer-threads.com/lists/lucene/java-user/32020?search_string=Luke%20boost%201;#32020
>
>
>
> On Dec 30, 2007, at 7:38 AM, Dominik Bruhn wrote:
>
> > Hy,
> > a solution came into my mind:
> > Every document gets boosted by a integer which I increment each time I
> > add a new document to the index. So the newer documents should get a
> > bigger boost than the older ones.
> >
> > I tried it out and ran into a problem:
> > Although I set the Boost via doc.setBoost (value) for each document
> > before writing it to the index it doesnt change anything. Even worse
> > if I look at the index using Luke (Version 0.7.1) each document got
> > a boost of 1 not of the value supplied.
> >
> > Who can help me?
> >
> > Thanks
> > --
> > Dominik Bruhn
> > mailto: dominik@dbruhn.de
>
> --------------------------
> Grant Ingersoll
> http://lucene.grantingersoll.com
> http://www.lucenebootcamp.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
Shailendra Sharma
+91-988-011-3066

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message