lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: SortingAtomicReader alternate to Tim-Sort...
Date Mon, 20 Apr 2015 07:26:52 GMT
I like these ideas; the int[] arrays we use today are wasteful. My only
concern about using FixedBitSet is that it would make sorting each
postings list run in O(maxDoc), but maybe we can improve on that by
using SparseFixedBitSet (added in 5.0; given your code snippets, I
assume you are still on 4.x)?

I'm curious whether you have already benchmarked this approach.
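
To make the trade-off concrete, here is a minimal, self-contained sketch of the
bitset idea. It is not Lucene code: java.util.BitSet stands in for FixedBitSet,
plain int[] arrays stand in for Sorter.DocMap and GrowableWriter, and the class
name, method names, and sample values are all hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Sketch: re-sort one postings list by marking remapped doc IDs in a bitset. */
final class BitSetPostingsSort {

    /**
     * Remaps each old doc ID through oldToNew and returns the new IDs in
     * ascending order. Reading the set bits in order replaces an explicit sort.
     */
    static int[] sortDocs(int[] oldDocs, int[] oldToNew, int maxDoc) {
        BitSet newDocs = new BitSet(maxDoc); // assumption: stand-in for FixedBitSet
        for (int oldDoc : oldDocs) {
            newDocs.set(oldToNew[oldDoc]);
        }
        int[] sorted = new int[oldDocs.length];
        int i = 0;
        for (int d = newDocs.nextSetBit(0); d >= 0; d = newDocs.nextSetBit(d + 1)) {
            sorted[i++] = d;
        }
        return sorted;
    }

    /** Looks up the frequency for a new doc ID through the inverse map. */
    static int freq(int newDoc, int[] newToOld, int[] oldDocToFreq) {
        return oldDocToFreq[newToOld[newDoc]];
    }

    public static void main(String[] args) {
        int[] oldToNew = {3, 0, 4, 1, 2};     // hypothetical doc map, maxDoc = 5
        int[] newToOld = {1, 3, 4, 0, 2};     // inverse of oldToNew
        int[] oldDocToFreq = {7, 0, 2, 0, 5}; // frequencies indexed by old doc ID

        int[] sorted = sortDocs(new int[] {0, 2, 4}, oldToNew, 5);
        System.out.println(Arrays.toString(sorted));
        for (int d : sorted) {
            System.out.println(d + " freq=" + freq(d, newToOld, oldDocToFreq));
        }
    }
}
```

With a dense bitset sized to maxDoc, the nextSetBit scan may touch every word
of the bitset even when the postings list is short, which is the O(maxDoc)
cost per postings list mentioned above.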


On Tue, Apr 14, 2015 at 2:07 PM, Ravikumar Govindarajan
<ravikumar.govindarajan@gmail.com> wrote:
> We were experimenting with SortingMergePolicy and came across an
> alternative to TimSorting the postings list, using FixedBitSet (FBS) and
> GrowableWriter.
>
> I have attached the relevant code snippet. It would be nice if someone
> could clarify whether it is a good idea to implement...
>
> public class SortingAtomicReader {
> …
> …
> class SortingDocsEnum {
>
>   // The last two parameters, newdoclist and olddocToFreq, are new
>   // additions to the constructor. They are assumed to be initialized
>   // when the merge starts and then reused until the merge completes...
>
>   public SortingDocsEnum(int maxDoc, final DocsEnum in, boolean withFreqs,
>       final Sorter.DocMap docMap, FixedBitSet newdoclist,
>       GrowableWriter olddocToFreq) throws IOException {
>     ….
>     …
>
> while (true) {
>   // Instead of TimSorting as in the existing code
>   doc = in.nextDoc();
>   if (doc == NO_MORE_DOCS) {
>     break; // stop once the wrapped enum is exhausted
>   }
>   int newdoc = docMap.oldToNew(doc);
>   newdoclist.set(newdoc);
>   if (withFreqs) {
>     olddocToFreq.set(doc, in.freq());
>   }
> }
>
>
> @Override
> public int nextDoc() throws IOException {
>   if (++docIt >= upto) {
>     return NO_MORE_DOCS;
>   }
>   currDoc = newdoclist.nextSetBit(++currDoc);
>   if (currDoc == -1) {
>     return NO_MORE_DOCS;
>   }
>   // Clear the set bit here before returning...
>   newdoclist.clear(currDoc);
>   return currDoc;
> }
>
>
> @Override
> public int freq() throws IOException {
>   if (withFreqs && docIt < upto) {
>     return (int) olddocToFreq.getMutable()
>         .get(docMap.newToOld(currDoc));
>   }
>   return 1;
> }
> }



-- 
Adrien
