lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Getting list of committed documents
Date Fri, 11 Nov 2016 11:14:23 GMT
Hi lukes,

First, IW never "auto commits".  The maxBufferedDocs/RAMBufferSizeMB
settings control when IW moves the recently indexed documents from RAM
to disk, but that move, which writes new segment files, does not
commit them.  The flushed segments stay invisible to an external
reader (unless you open a near-real-time reader from IW) until you
explicitly call IW.commit.

Second, every IW operation returns a long sequence number, and so does
IW.commit, such that all sequence numbers <= the sequence number
returned from IW.commit "made it" into the index, and all other ops
did not make it.

You should be able to use this information to e.g. tell the channel
(e.g. a kafka queue) which offset your Lucene app has "durably"
consumed.
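
As a rough illustration of that bookkeeping, here is a minimal sketch
in plain Java with no Lucene dependency.  CommitOffsetTracker, record
and onCommit are hypothetical names, not Lucene or Kafka APIs.  It
assumes each indexing thread records the channel offset together with
the sequence number its IW call returned, and that offsets are dense
(consecutive, no gaps); adapt it if your channel skips offsets.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical helper: maps channel offsets to the IndexWriter sequence
// numbers of the operations that indexed them.
final class CommitOffsetTracker {
    // offset -> sequence number of ops that are not yet known to be committed
    private final ConcurrentSkipListMap<Long, Long> pending = new ConcurrentSkipListMap<>();
    // highest offset such that it and every earlier offset is committed
    private long durableOffset;

    CommitOffsetTracker(long firstOffset) {
        this.durableOffset = firstOffset - 1;
    }

    // Called by an indexing thread after its IW operation returned seqNo.
    void record(long offset, long seqNo) {
        pending.put(offset, seqNo);
    }

    // Called by the commit thread with the value returned from IW.commit.
    // Advances only over consecutive offsets whose seqNo <= commitSeqNo,
    // so an offset that is still in flight on another thread blocks
    // everything after it.
    synchronized long onCommit(long commitSeqNo) {
        Map.Entry<Long, Long> e;
        while ((e = pending.firstEntry()) != null
                && e.getKey() == durableOffset + 1
                && e.getValue() <= commitSeqNo) {
            pending.remove(e.getKey());
            durableOffset = e.getKey();
        }
        return durableOffset;
    }
}
```

The commit thread would pass the long returned by IW.commit to
onCommit and report the result back to the channel.  The contiguity
check is there because, with several indexing threads, a higher offset
can receive a lower sequence number than an earlier offset.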

Mike McCandless

http://blog.mikemccandless.com


On Wed, Nov 9, 2016 at 2:40 PM, lukes <mail2lokesh@gmail.com> wrote:
> Hi all,
>
>   I need some feedback on getting hold of the documents which got committed
> during a commit call on IndexWriter. There are multiple threads which keep
> adding documents to the IndexWriter in parallel, and there's another thread
> which wakes up every n minutes and does the commit. Below are the
> points I need help on:
>
> 1) How to disable the auto flush / commit? On IndexWriterConfig I set
> setMaxBufferedDocs(IndexWriterConfig.DISABLE_AUTO_FLUSH) and also
> setRAMBufferSizeMB to a high number, around 128 MB. Is this correct, or is
> there any other knob I need to play around with?
>
> 2) How to find out which documents got committed during a commit (so that a
> certain action can be done, like removing them from the channel, etc.)? I
> tried extending IndexWriter and overriding doAfterFlush, but I don't see any
> way to get a handle on the documents which made it into this commit.
>
> Any help is really appreciated.
>
> Regards.
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Getting-list-of-committed-documents-tp4305258.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>