lucene-java-user mailing list archives

From Michael McCandless
Subject Re: IndexCommit.delete() outside of IndexDeletionPolicy
Date Wed, 06 Jun 2012 11:37:21 GMT
I think this use case makes sense; such logic (for a distributed / ref
counted deletion policy) would make a nice contribution ... it's the
"proper" way to delete commits when multiple nodes are in use (vs eg
using a timeout deletion policy).

You can actually do it today: call IndexWriter.deleteUnusedFiles.
That visits the deletion policy and then you have a chance to delete
commit points (it'd mean you have to set a real deletion policy on the
writer, which in turn goes and checks the reference counts across all

Mike McCandless

On Wed, Jun 6, 2012 at 7:16 AM, Colin Goodheart-Smithe wrote:
> I was looking at the Lucene API for IndexCommit and noticed that the
> JavaDoc states that
> *'Decision that a commit-point should be deleted is taken by the
> IndexDeletionPolicy in effect and therefore this should only be called
> by its onInit() or onCommit() methods.'*
> I was wondering why this is the case and whether deleting IndexCommits
> outside of an IndexDeletionPolicy is actually a bad idea?
> To put some context around this, I am looking to implement a deletion policy
> which is independent of the IndexWriter commit and instead depends on the
> processes using a particular commit point being finished with it.
> The logic would look something like the following, and state would be stored
> in something like ZooKeeper so I can make use of ephemeral nodes and watcher
> events:
>   - IndexWriters would have a NoDeletionPolicy set
>   - Each time a process opens a session it registers an ephemeral node
>   - The session is assigned the current (latest) commit point
>   - Each time a process removes the node (either through crashing or
>   having finished the job) a watch event is fired where a separate process
>   will delete the commit point the process was using if no other processes
>   are using the commit point and if it is not the latest commit point
> Processes may have fairly long-running sessions, so across all the processes
> a reasonable number of commit points might be in use.  I don't really want
> to have to wait for a commit from the IndexWriter (which may not happen for
> a while) to clear up the older commit points I no longer need.  Would this
> logic pose any issues, given that it is going to be deleting commit points
> outside of the IndexDeletionPolicy?
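
The bookkeeping described in the list above could be sketched roughly as
follows. This is only an illustration, not Lucene or ZooKeeper code: the
class CommitRefCounts and its methods are hypothetical names, the map is an
in-memory stand-in for the ephemeral nodes, and commits are identified by
generation number. It assumes the commit list arrives sorted oldest first,
so the last entry is the latest commit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical in-memory stand-in for the shared session state
 * (in the proposal this would live in ZooKeeper ephemeral nodes).
 */
class CommitRefCounts {
    // session id -> generation of the commit point that session is using
    private final Map<String, Long> sessions = new HashMap<>();

    /** Register a session against the commit generation it was assigned. */
    synchronized void openSession(String sessionId, long generation) {
        sessions.put(sessionId, generation);
    }

    /** Remove a session, whether it finished its job or crashed. */
    synchronized void closeSession(String sessionId) {
        sessions.remove(sessionId);
    }

    /** True if at least one live session still uses this generation. */
    synchronized boolean isInUse(long generation) {
        return sessions.containsValue(generation);
    }

    /**
     * Returns the generations that are safe to delete: not used by any
     * session and not the latest commit. The input is assumed to be
     * sorted oldest first.
     */
    synchronized List<Long> deletable(List<Long> generationsOldestFirst) {
        List<Long> result = new ArrayList<>();
        // Stop before the last entry: the latest commit is always kept.
        for (int i = 0; i < generationsOldestFirst.size() - 1; i++) {
            long generation = generationsOldestFirst.get(i);
            if (!isInUse(generation)) {
                result.add(generation);
            }
        }
        return result;
    }
}
```

Following the reply above, one way to wire this in would be for a real
deletion policy on the writer to consult such a structure from onInit() and
onCommit() and call IndexCommit.delete() on the matching commits, with the
watcher process calling IndexWriter.deleteUnusedFiles to trigger that pass
without waiting for the next commit.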
