lucene-java-user mailing list archives

From Rob Audenaerde <rob.audenae...@gmail.com>
Subject force deletes - terms enum still has deleted terms?
Date Fri, 28 Sep 2018 12:40:41 GMT
Hi all,

We build an FST on the terms of our index by iterating over the terms of the
readers for our fields, like this:

    for (final LeafReaderContext ctx : leaves) {
        final LeafReader leafReader = ctx.reader();

        for (final String indexField : indexFields) {
            final Terms terms = leafReader.terms(indexField);
            // If the field does not exist in this reader, then we get null, so check for that.
            if (terms != null) {
                final TermsEnum termsEnum = terms.iterator();
                // ... iterate termsEnum and add each term to the FST
            }
        }
    }

However, building the FST sometimes seems to find terms that come from
deleted documents. According to the javadocs, this is expected: the terms
enum does not take deletions into account.
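
To make that behaviour concrete, here is a minimal toy model in plain Java.
It is not the Lucene API: ToySegment and all of its methods are hypothetical
names, and the sketch only assumes that a segment keeps its term dictionary
unchanged while deletions are recorded in a separate live-docs bit set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: this is NOT the Lucene API. All names are hypothetical.
class ToySegment {
    // term -> ids of the documents that contain it (sorted by term)
    final TreeMap<String, int[]> postings = new TreeMap<>();
    // bit set => document is live; bit cleared => document is marked deleted
    final BitSet liveDocs = new BitSet();

    void addDoc(int docId, String... terms) {
        liveDocs.set(docId);
        for (String t : terms) {
            int[] old = postings.getOrDefault(t, new int[0]);
            int[] grown = Arrays.copyOf(old, old.length + 1);
            grown[old.length] = docId;
            postings.put(t, grown);
        }
    }

    // Deleting only flips a bit; the term dictionary is left untouched.
    void deleteDoc(int docId) {
        liveDocs.clear(docId);
    }

    // What a plain walk over the term dictionary sees: every term ever written.
    List<String> allTerms() {
        return new ArrayList<>(postings.keySet());
    }

    // Terms that still occur in at least one live document.
    List<String> liveTerms() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, int[]> e : postings.entrySet()) {
            for (int doc : e.getValue()) {
                if (liveDocs.get(doc)) {
                    out.add(e.getKey());
                    break;
                }
            }
        }
        return out;
    }
}
```

In this model, a term whose only document has been deleted still shows up in
allTerms() and is dropped only by the explicit live-docs check in liveTerms().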

So we have now switched the IndexWriter to a config with a TieredMergePolicy
configured with setForceMergeDeletesPctAllowed(0).

When calling indexWriter.forceMergeDeletes(true), we expect that no deleted
documents will remain. However, the deleted terms still sometimes appear. We
use DirectoryReader.openIfChanged() to refresh the reader before iterating
over the terms.

Are we forgetting something?

Thanks in advance.
Rob Audenaerde
