lucene-java-user mailing list archives

From Barry Coughlan <b.coughl...@gmail.com>
Subject Iterating TermsEnum for Long field produces zero values at the end
Date Mon, 17 Nov 2014 15:39:54 GMT
Hi all,

I'm using 4.10.2. I have a Long "id" field. Each document has one "id"
value. I am creating a look-up between Lucene's internal document id and my
"id" values by enumerating the inverted index:

    private long[] cacheDocIds() throws IOException {
        long[] ourIds = new long[reader.maxDoc()];

        Bits liveDocs = MultiFields.getLiveDocs(reader);
        Fields fields = MultiFields.getFields(reader);
        Terms terms = fields.terms("id");

        TermsEnum iterator = terms.iterator(null);
        BytesRef bytesRef = null;
        while ((bytesRef = iterator.next()) != null) {
            DocsEnum docsEnum = iterator.docs(liveDocs, null, DocsEnum.FLAG_NONE);

            int luceneId = docsEnum.nextDoc();
            long ourId = NumericUtils.prefixCodedToLong(bytesRef);
            System.out.println(luceneId + " " + ourId);
            ourIds[luceneId] = ourId;
        }

        return ourIds;
    }

With 5 documents (ids 1, 2, 3, 4, 5) I get this output from the above code:

0 1
1 2
2 3
3 4
4 5
0 0
0 0
0 0

I don't understand why there are three zeroes at the end.

- reader.maxDoc is 5 and no documents have been deleted.
- I have tried this with a varying number of documents and there are always
three zeroes at the end.
- I tried changing version to Lucene 4.10.0 and Lucene 4.9 and the same
behavior occurs.

I can work around this, but I'm curious whether this behavior is
expected.
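
One plausible explanation, stated as an assumption rather than a confirmed
diagnosis: numeric fields in Lucene 4.x are trie-encoded, so each long value
is indexed as several terms, one per precision "shift". Assuming the field
was indexed with a precisionStep of 16, each value yields terms at shifts 0,
16, 32 and 48, and decoding a shifted term returns the value with its low
bits zeroed. The sketch below is a stand-alone model of that arithmetic. It
does not call Lucene, and the class and method names (TrieTermSketch,
decodedAtShift, distinctTerms) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-alone model of trie-encoded long terms; it does not call Lucene.
class TrieTermSketch {
    // Flipping the sign bit makes signed longs sort correctly as unsigned bits.
    private static final long SIGN_FLIP = 0x8000000000000000L;

    // Models what decoding a term stored at the given shift yields:
    // the original value with its lowest 'shift' bits zeroed.
    static long decodedAtShift(long value, int shift) {
        long sortableBits = (value ^ SIGN_FLIP) >>> shift;
        return (sortableBits << shift) ^ SIGN_FLIP;
    }

    // Distinct terms as {shift, decodedValue}, ordered by shift and then by value.
    // ASSUMPTION: one term per value at each multiple of precisionStep below 64.
    static List<long[]> distinctTerms(long[] ids, int precisionStep) {
        List<long[]> terms = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += precisionStep) {
            TreeSet<Long> values = new TreeSet<>();
            for (long id : ids) {
                values.add(decodedAtShift(id, shift));
            }
            for (long v : values) {
                terms.add(new long[] {shift, v});
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        for (long[] t : distinctTerms(new long[] {1, 2, 3, 4, 5}, 16)) {
            System.out.println("shift=" + t[0] + " decoded=" + t[1]);
        }
    }
}
```

Under these assumptions, ids 1 to 5 produce five full-precision terms (shift
0) followed by three lower-precision terms that all decode to 0 and are
shared by every document, so nextDoc() returns 0 for each. That would match
the three trailing "0 0" lines. If this is the cause, skipping terms for
which NumericUtils.getPrefixCodedLongShift(bytesRef) is not 0 should leave
only the real values.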

Regards,
Barry
