lucene-java-user mailing list archives

From Andres de la Peña <adelap...@stratio.com>
Subject Paging with EarlyTerminatingSortingCollector
Date Thu, 14 Apr 2016 09:39:54 GMT
Hi all,

Is it possible to page over results when using
an EarlyTerminatingSortingCollector?

I'm using the following code with Lucene 5.5.0 to read results in pages of
10 documents each:

/** The Lucene field name */
private static final String FIELD_NAME = "id";

/** The Lucene field type */
private static final FieldType FIELD_TYPE = new FieldType();
static {
    FIELD_TYPE.setTokenized(true);
    FIELD_TYPE.setOmitNorms(true);
    FIELD_TYPE.setIndexOptions(IndexOptions.DOCS);
    FIELD_TYPE.setNumericType(FieldType.NumericType.INT);
    FIELD_TYPE.setDocValuesType(DocValuesType.NUMERIC);
    FIELD_TYPE.setStored(true);
    FIELD_TYPE.freeze();
}

public static void main(String[] args) throws Exception {

    // Sort to be used both with merge policy and queries
    Sort sort = new Sort(
            new SortedNumericSortField(FIELD_NAME, SortField.Type.INT));

    // Create directory
    RAMDirectory directory = new RAMDirectory();

    // Setup merge policy
    TieredMergePolicy tieredMergePolicy = new TieredMergePolicy();
    SortingMergePolicy sortingMergePolicy =
            new SortingMergePolicy(tieredMergePolicy, sort);

    // Setup index writer
    IndexWriterConfig indexWriterConfig =
            new IndexWriterConfig(new SimpleAnalyzer());
    indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
    indexWriterConfig.setMergePolicy(sortingMergePolicy);
    IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);

    // Index values
    for (int i = 1; i <= 1000; i++) {
        Document document = new Document();
        document.add(new IntField(FIELD_NAME, i, FIELD_TYPE));
        indexWriter.addDocument(document);
    }

    // Force index merge to ensure early termination
    indexWriter.forceMerge(1, true);
    indexWriter.commit();

    // Create index searcher
    IndexReader reader = DirectoryReader.open(directory);
    IndexSearcher searcher = new IndexSearcher(reader);

    // Paginated read
    int pageSize = 10;
    FieldDoc pageStart = null;
    while (true) {

        System.out.println(String.format(
                "\nCollecting page starting at: %s", pageStart));

        Query query = new MatchAllDocsQuery();

        TopFieldCollector tfc = TopFieldCollector.create(
                sort, pageSize, pageStart, true, false, false);
        EarlyTerminatingSortingCollector collector =
                new EarlyTerminatingSortingCollector(tfc, sort, pageSize, sort);
        searcher.search(query, collector);
        ScoreDoc[] scoreDocs = tfc.topDocs().scoreDocs;
        for (ScoreDoc scoreDoc : scoreDocs) {
            pageStart = (FieldDoc) scoreDoc;
            Document document = searcher.doc(scoreDoc.doc);
            System.out.println(String.format(
                    "FOUND %s -> %s", document, scoreDoc));
        }

        System.out.println(String.format(
                "Terminated early: %s", collector.terminatedEarly()));

        if (scoreDocs.length < pageSize) {
            break;
        }
    }

    // Close
    reader.close();
    indexWriter.close();
    directory.close();
}


The first page comes back as expected, but the query for the second page
returns no results. When I don't wrap the TopFieldCollector with the
EarlyTerminatingSortingCollector, every page returns the expected results.
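
One possible explanation, sketched below in plain Java with no Lucene
dependency: the sketch assumes that the early-terminating wrapper counts every
document it visits in a segment, including documents that the paging collector
skips because they sort at or before pageStart. That assumption is unverified,
and the class and method names (EarlyTerminationPagingModel, collectPage) are
hypothetical; this is a model of the suspected behaviour, not Lucene's
implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class EarlyTerminationPagingModel {

    /**
     * Models one search over a single segment whose values are stored in sort order.
     * Values at or before the cursor "after" are skipped, and at most pageSize hits are kept.
     * When earlyTerminate is true, the scan stops once maxVisited values have been visited,
     * whether or not they were accepted (ASSUMPTION about the wrapper's behaviour).
     */
    static List<Integer> collectPage(int[] sortedValues, Integer after, int pageSize,
                                     boolean earlyTerminate, int maxVisited) {
        List<Integer> hits = new ArrayList<>();
        int visited = 0;
        for (int value : sortedValues) {
            // Paging filter: only values strictly after the cursor are accepted.
            if ((after == null || value > after) && hits.size() < pageSize) {
                hits.add(value);
            }
            // The visit counter advances even when the value was skipped.
            visited++;
            if (earlyTerminate && visited >= maxVisited) {
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Same data as the test above: ids 1..1000, already in sort order.
        int[] ids = new int[1000];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i + 1;
        }

        // First page with early termination: ids 1..10.
        System.out.println(collectPage(ids, null, 10, true, 10));
        // Second page with early termination: empty, the scan stops on ids 1..10.
        System.out.println(collectPage(ids, 10, 10, true, 10));
        // Second page without early termination: ids 11..20.
        System.out.println(collectPage(ids, 10, 10, false, 10));
    }
}
```

Under that assumption, the second page is empty because the scan stops after
the first 10 documents of the segment, all of which sort at or before
pageStart.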

Is there something I am missing? Is EarlyTerminatingSortingCollector not
compatible with paging?

Thanks in advance,

-- 
Andrés de la Peña

