lucene-java-user mailing list archives

From Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
Subject IndexWriter updateDocument is removing doc from index
Date Thu, 15 Mar 2018 13:17:45 GMT
While writing some tools to build and maintain Lucene indexes I noticed
some strange behavior during testing:
a doc disappears from the Lucene index when using IndexWriter.updateDocument.

The Lucene 6.4.2 API documentation states:
"Updates a document by first deleting the document(s) containing term and
then adding the new document. The delete and then add are atomic as seen
by a reader on the same index (flush may happen only after the add)."
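
Read literally, that contract means an update never changes the number of live documents when the term already matches exactly one doc, and behaves like a plain add when the term matches nothing. The following is a toy model of that reading in plain Java. It is not Lucene code: `ToyIndex` and its methods are hypothetical names, and each document is reduced to the value of its id term.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of the documented delete-then-add semantics.
// NOT Lucene code; ToyIndex and its methods are hypothetical names.
class ToyIndex {
    // Each "document" is reduced to the value of its id term.
    private final List<String> liveIds = new ArrayList<>();

    // Mirrors the quoted contract: delete every doc containing the term,
    // then add the new doc.
    void updateDocument(String idTerm, String newDocId) {
        liveIds.removeIf(id -> id.equals(idTerm));
        liveIds.add(newDocId);
    }

    int numDocs() {
        return liveIds.size();
    }

    boolean contains(String id) {
        return liveIds.contains(id);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        // The term matches nothing yet, so each call is a pure add.
        for (int i = 0; i < 8; i++) {
            index.updateDocument("doc" + i, "doc" + i);
        }
        index.updateDocument("doc3", "doc3"); // first update of an existing doc
        index.updateDocument("doc3", "doc3"); // second update of the same doc
        // prints "8 live docs, doc3 present: true"
        System.out.println(index.numDocs() + " live docs, doc3 present: " + index.contains("doc3"));
    }
}
```

Under this model the updated doc is still present after any number of updates, which is the behavior expected from the real IndexWriter.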

I can reproduce it, but maybe it works as designed and I have
to call some "flush" after using updateDocument?

Are there any known issues or pitfalls with org.apache.lucene.index.IndexWriter.updateDocument?

Steps I took:
- created a new lucene index with 8 docs and 1 segment
  segment_0 with DelCount:0, DelGen:-1, numDocs:8, maxDocs:8
- updated 1 doc in the index with updateDocument which results in
  segment_0 with DelCount:1, DelGen:1, numDocs:7, maxDocs:8
  segment_1 with DelCount:0, DelGen:-1, numDocs:1, maxDocs:1
so far OK, but now:
- updated the same doc again and added 12 new docs
  segment_0 with DelCount:1, DelGen:1, numDocs:7, maxDocs:8
  segment_2 with DelCount:0, DelGen:-1, numDocs:12, maxDocs:12

The result is that segment_1 disappeared, and with it the updated document.
Only the 7 docs of segment_0 and the 12 newly added documents of segment_2 remain.
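
The counts in the segment listings above can be checked with a little arithmetic, using numDocs = maxDocs - DelCount per segment. The sketch below is plain Java, not Lucene code; `SegmentStats` is a hypothetical helper, and the expected total assumes the 12 new docs are all distinct from the 8 existing ones.

```java
// Live-document arithmetic for the reported segment listings.
// SegmentStats is a hypothetical helper, not a Lucene class.
class SegmentStats {
    final String name;
    final int maxDoc;
    final int delCount;

    SegmentStats(String name, int maxDoc, int delCount) {
        this.name = name;
        this.maxDoc = maxDoc;
        this.delCount = delCount;
    }

    // Live docs in one segment: all docs minus the deleted ones.
    int numDocs() {
        return maxDoc - delCount;
    }

    // Live docs across all segments of the index.
    static int totalLive(SegmentStats... segments) {
        int total = 0;
        for (SegmentStats s : segments) {
            total += s.numDocs();
        }
        return total;
    }

    public static void main(String[] args) {
        int afterFirstUpdate = totalLive(
                new SegmentStats("segment_0", 8, 1),
                new SegmentStats("segment_1", 1, 0));
        int afterSecondUpdate = totalLive(
                new SegmentStats("segment_0", 8, 1),
                new SegmentStats("segment_2", 12, 0));
        // Assumption: an update keeps the live count, and 12 distinct docs are added.
        int expected = afterFirstUpdate + 12;
        // prints "after first update: 8, after second update: 19, expected: 20"
        System.out.println("after first update: " + afterFirstUpdate
                + ", after second update: " + afterSecondUpdate
                + ", expected: " + expected);
    }
}
```

The one-document gap between the expected 20 and the reported 19 is exactly the updated doc that went missing.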

By the way, is it allowed to use updateDocument to also add new docs?

Regards
Bernd

