lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Commented: (LUCENE-1335) Correctly handle concurrent calls to addIndexes, optimize, commit
Date Sat, 23 Aug 2008 12:47:44 GMT


Michael McCandless commented on LUCENE-1335:

Just to clarify: close(waitForMerges=false) and rollback() make
an ongoing addIndexes[NoOptimize](dirs) abort, but wait for
addIndexes(readers) to finish. It'd be nice if they make any
addIndexes* abort for a quick shutdown, but that's for later.

True, agreed.

commit() and commit(long) use the read-write lock to wait for
a running addIndexes. "committing" is used to serialize commit()
calls. Why isn't it also used to serialize commit(long) calls?

It's because commit() calls prepareCommit(), which throws a
"prepareCommit was already called" exception if the commit was already
prepared.  Whereas commit(long) doesn't call prepareCommit (eg, it
doesn't need to flush).  Without this, I was hitting exceptions in one
of the tests that calls commit() from multiple threads at the same
time.
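
For illustration only, here is a minimal Java sketch of that kind of
serialization. It is not the code from the patch: the class CommitSketch,
the fields prepared and commitCount, and the helper finishCommit() are
hypothetical names; only the "committing" flag and the exception message
come from the discussion above.

```java
// Hypothetical sketch; NOT the IndexWriter source from the patch.
class CommitSketch {
    private boolean committing; // true while one thread is inside commit()
    private boolean prepared;   // true between prepareCommit() and finishCommit()
    int commitCount;            // number of completed commits (for demonstration)

    synchronized void prepareCommit() {
        if (prepared) {
            throw new IllegalStateException("prepareCommit was already called");
        }
        prepared = true;
    }

    private synchronized void finishCommit() {
        prepared = false;
        commitCount++;
    }

    void commit() throws InterruptedException {
        synchronized (this) {
            // Serialize commit() calls: wait until no other thread is committing.
            while (committing) {
                wait();
            }
            committing = true;
        }
        try {
            prepareCommit(); // cannot race with another commit()'s prepareCommit()
            finishCommit();
        } finally {
            synchronized (this) {
                committing = false;
                notifyAll(); // wake up any commit() callers waiting their turn
            }
        }
    }
}
```

Without the committing guard, two threads could both reach prepareCommit()
before either finishes, and the second would hit the exception.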

    * In finishMerges, acquireRead and releaseRead are both called.
      Isn't addIndexes allowed again?

This is to make sure any just-started addIndexes cleanly finish or
abort before we enter the wait loop.  I was seeing cases where the
wait loop would think no more merges were pending, but in fact an
addIndexes was just getting underway and was about to start merging.
It's OK if a new addIndexes call starts up, because it'll be forced to
check the stop conditions (closing=true or stopMerges=true) and then
abort the merges.  I'll add comments to this effect.
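
As a rough illustration of this acquire-then-release barrier, here is a
Java sketch. It assumes addIndexes holds the write side of a read-write
lock while it registers merges, and it uses
java.util.concurrent.locks.ReentrantReadWriteLock as a stand-in for
IndexWriter's own lock methods. The class MergeBarrierSketch, the
pendingMerges counter, and mergeFinished() are hypothetical; only
stopMerges, finishMerges, and addIndexes are names from the discussion.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; NOT the IndexWriter source from the patch.
class MergeBarrierSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private volatile boolean stopMerges;
    private int pendingMerges; // guarded by "this"

    /** Returns true if merges were registered, false if the call aborted. */
    boolean addIndexes(int merges) {
        lock.writeLock().lock();
        try {
            if (stopMerges) {
                return false; // stop condition seen: abort instead of merging
            }
            synchronized (this) {
                pendingMerges += merges;
            }
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Called when a merge completes or is aborted. */
    synchronized void mergeFinished() {
        pendingMerges--;
        notifyAll();
    }

    synchronized int pending() {
        return pendingMerges;
    }

    void finishMerges() throws InterruptedException {
        stopMerges = true;
        // Barrier: blocks until any in-flight addIndexes releases the write
        // lock, so its merges are registered before the wait loop looks.
        lock.readLock().lock();
        lock.readLock().unlock();
        synchronized (this) {
            while (pendingMerges > 0) {
                wait();
            }
        }
    }
}
```

An addIndexes call that starts after the barrier is harmless in this
sketch, because it sees stopMerges set and returns without registering
any merges.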

    * In copyExternalSegments, merges involving external segments
      are carried out in foreground. So why the changes? To relax
      that assumption? But other part still makes the assumption.

This method has always carried out merges in the FG, but it's in fact
possible that a BG merge thread on finishing a previous merge may pull
a merge involving external segments.  So I changed this method to wait
for all such BG merges to complete, because it's not allowed to return
until there are no more external segments in the index.

It is tempting to fully schedule these external merges (ie allow them
to run in BG), but there is a problem: if there is some error on doing
the merge, we need that error to be thrown in the FG thread calling
copyExternalSegments (so the transaction above unwinds).  (Ie we
can't just stuff these external merges into the merge queue then wait
for their completion).  So I think we need to leave it as is?

    * addIndexes(readers) should optimize before startTransaction, no?

I had to move the optimize() inside the transaction because it could
happen that after the optimize() is finished, some other thread sneaks
in a call to addIndexes* and gets additional segments added to the
index such that by the time we start the transaction we now have more
than one segment.

But this change will tie up more disk space than addIndexes used to
(since it will also rollback the optimize on hitting an exception).
Really I just need to pre-acquire the write lock, then I can leave
optimize() out of the transaction.  I'll do that.
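
To make the ordering concrete, here is a toy Java sketch of "pre-acquire
the write lock, optimize, then start the transaction". It is only a
model: segments are plain strings, a ReentrantLock stands in for the
write lock, the transaction is a snapshot-and-restore of the list, and
the class AddIndexesSketch and the fail parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical toy model; NOT the IndexWriter source from the patch.
class AddIndexesSketch {
    private final ReentrantLock writeLock = new ReentrantLock();
    final List<String> segments = new ArrayList<>();

    /** Merges all current segments into one. */
    void optimize() {
        if (segments.size() > 1) {
            String merged = String.join("+", segments);
            segments.clear();
            segments.add(merged);
        }
    }

    void addIndexes(List<String> incoming, boolean fail) {
        // Taken first, so no other addIndexes* can add segments between
        // the optimize and the start of the transaction.
        writeLock.lock();
        try {
            optimize(); // outside the transaction: kept even if we fail below
            List<String> snapshot = new ArrayList<>(segments); // "startTransaction"
            try {
                segments.addAll(incoming);
                if (fail) {
                    throw new IllegalStateException("simulated merge failure");
                }
                optimize();
            } catch (RuntimeException e) {
                segments.clear();
                segments.addAll(snapshot); // "rollbackTransaction"
                throw e;
            }
        } finally {
            writeLock.unlock();
        }
    }
}
```

In this model a failed call rolls back only the added segments; the
result of the initial optimize survives, so the transaction does not
have to retain the pre-optimize segments.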

    * The newly added method segString(dir) in SegmentInfos is
      not used anywhere.

Yeah I was using this for internal debugging, and I think it's
generally useful for future debugging, so I left it in.

> Correctly handle concurrent calls to addIndexes, optimize, commit
> -----------------------------------------------------------------
>                 Key: LUCENE-1335
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.3, 2.3.1
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.4
>         Attachments: LUCENE-1335.patch, LUCENE-1335.patch, LUCENE-1335.patch
> Spinoff from here:

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
