lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: IndexWriter updateDocument is removing doc from index
Date Fri, 16 Mar 2018 18:50:16 GMT
Yes, you can add documents by calling updateDocument -- if no prior
documents match the deletion Term you provide, nothing is deleted and
your new doc is simply added.

Hmm, are you sure your 2nd update really updated the doc and then added
12 new docs? Dropping segment 1 makes sense because you deleted its one
doc (the one from your first update) and Lucene drops 100% deleted segments.

But your 3rd segment should have had 13 docs if you really added 12 new
docs and updated.
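
To make the expected counts concrete, here is a rough sketch that replays the
three reported steps in a plain-Java toy model. It is not Lucene code and does
not use the Lucene API: the SegmentModel class, its update/commit methods and
the doc ids are all hypothetical, and it assumes each step is followed by a
commit. It models an update as "mark every older copy deleted, then append to
the newest segment", and it drops fully deleted segments at commit.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of segment bookkeeping. NOT Lucene code; all names are hypothetical. */
public class SegmentModel {

    /** A write-once segment: docs are appended, later only marked deleted. */
    static final class Segment {
        final String name;
        final List<String> ids = new ArrayList<>();
        final List<Boolean> deleted = new ArrayList<>();

        Segment(String name) { this.name = name; }

        void add(String id) { ids.add(id); deleted.add(false); }

        int maxDoc() { return ids.size(); }

        int delCount() {
            int n = 0;
            for (boolean d : deleted) if (d) n++;
            return n;
        }

        int numDocs() { return maxDoc() - delCount(); }
    }

    final List<Segment> segments = new ArrayList<>();
    private Segment pending;      // docs buffered since the last commit
    private int nextSegment = 0;  // counter used to name new segments

    /** Models an update: delete every copy of this id, then add a new one. */
    void update(String id) {
        for (Segment s : segments) markDeleted(s, id);
        if (pending != null) markDeleted(pending, id);
        if (pending == null) pending = new Segment("segment_" + nextSegment++);
        pending.add(id);  // if nothing matched above, this is a plain add
    }

    /** Models a commit: write the buffered docs, then drop 100% deleted segments. */
    void commit() {
        if (pending != null) { segments.add(pending); pending = null; }
        segments.removeIf(s -> s.numDocs() == 0);
    }

    private static void markDeleted(Segment s, String id) {
        for (int i = 0; i < s.ids.size(); i++) {
            if (s.ids.get(i).equals(id)) s.deleted.set(i, true);
        }
    }

    String describe() {
        StringBuilder sb = new StringBuilder();
        for (Segment s : segments) {
            sb.append(s.name)
              .append(" delCount=").append(s.delCount())
              .append(" numDocs=").append(s.numDocs())
              .append(" maxDoc=").append(s.maxDoc())
              .append('\n');
        }
        return sb.toString();
    }

    /** Replays the three steps from the report, committing after each one. */
    static SegmentModel replayReportedSteps() {
        SegmentModel index = new SegmentModel();
        for (int i = 0; i < 8; i++) index.update("doc" + i);   // 8 new docs
        index.commit();
        index.update("doc0");                                  // update 1 doc
        index.commit();
        index.update("doc0");                                  // update it again
        for (int i = 8; i < 20; i++) index.update("doc" + i);  // plus 12 new docs
        index.commit();
        return index;
    }

    public static void main(String[] args) {
        System.out.print(replayReportedSteps().describe());
        // prints:
        // segment_0 delCount=1 numDocs=7 maxDoc=8
        // segment_2 delCount=0 numDocs=13 maxDoc=13
    }
}
```

In this model segment_1 disappears as well, but the re-added doc lands in
segment_2, giving 13 docs there and 20 live docs overall. A segment_2 with
only 12 docs would mean the updated doc never reached the writer in that step.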

Mike McCandless

http://blog.mikemccandless.com

On Thu, Mar 15, 2018 at 9:17 AM, Bernd Fehling <
bernd.fehling@uni-bielefeld.de> wrote:

> While writing some tools to build and maintain lucene indexes I noticed
> some strange behavior during testing.
> A doc disappears from lucene index while using IndexWriter updateDocument.
>
> The API of lucene 6.4.2 states:
> "Updates a document by first deleting the document(s) containing term and
> then adding the new document. The delete and then add are atomic as seen
> by a reader on the same index (flush may happen only after the add)."
>
> I could reproduce it, but it may be that it works as designed and I have
> to call some "flush" after using updateDocument?
>
> Any known issue or pitfall with org.apache.lucene.index.IndexWriter.updateDocument
> ?
>
> Steps I took:
> - created a new lucene index with 8 docs and 1 segment
>   segment_0 with DelCount:0, DelGen:-1, numDocs:8, maxDocs:8
> - updated 1 doc in the index with updateDocument which results in
>   segment_0 with DelCount:1, DelGen:1, numDocs:7, maxDocs:8
>   segment_1 with DelCount:0, DelGen:-1, numDocs:1, maxDocs:1
> so far OK, but now:
> - updated the same doc again and added 12 new docs
>   segment_0 with DelCount:1, DelGen:1, numDocs:7, maxDocs:8
>   segment_2 with DelCount:0, DelGen:-1, numDocs:12, maxDocs:12
>
> The result is that segment_1 disappeared, and with it the updated
> document.
> Only the 7 docs of segment_0 and the 12 newly added documents of segment_2
> remain.
>
> By the way, is it allowed to use updateDocument to also add new docs?
>
> Regards
> Bernd
