lucene-dev mailing list archives

From "Michael McCandless (JIRA)"
Subject [jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
Date Tue, 18 Jan 2011 18:36:46 GMT


Michael McCandless commented on LUCENE-2324:

The branch is looking very nice!!  Very clean :)

Random comments:

Why does DW.anyDeletions need to be sync'd?

Missing headers on at least DocumentsWriterPerThreadPool,

IWC.setIndexerThreadPool's javadoc is stale.

On ThreadAffinityDWTP... it may be better if we had a single queue,
where threads wait in line, if no DWPT is available?  And when a DWPT
finishes it then notifies any waiting threads?  (Ie, instead of queue-per-DWPT).
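
To make the single-queue idea concrete, here is a minimal sketch, not the branch's code. SingleQueuePool and its methods are hypothetical names, and T stands in for a DWPT. All indexing threads wait on one shared condition, and a thread that returns its writer wakes one waiter. The fair lock is an assumption made so that waiters are served roughly in arrival order.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical pool: one shared wait line instead of one queue per writer. */
final class SingleQueuePool<T> {
  // Fair lock (assumption): the longest-waiting thread is favored.
  private final ReentrantLock lock = new ReentrantLock(true);
  private final Condition available = lock.newCondition();
  private final Deque<T> free = new ArrayDeque<>();

  SingleQueuePool(Collection<T> writers) {
    free.addAll(writers);
  }

  /** Blocks until some writer is free, then hands it to the caller. */
  T acquire() {
    lock.lock();
    try {
      while (free.isEmpty()) {
        // Simplification: interrupts are ignored while waiting.
        available.awaitUninterruptibly();
      }
      return free.pop();
    } finally {
      lock.unlock();
    }
  }

  /** Returns a writer to the pool and wakes one waiting thread, if any. */
  void release(T writer) {
    lock.lock();
    try {
      free.push(writer);
      available.signal();
    } finally {
      lock.unlock();
    }
  }
}
```

The trade-off versus queue-per-DWPT is that a waiting thread takes whichever writer frees up first, so no thread stalls behind one busy writer, at the cost of losing thread-to-writer affinity.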

I see the fieldInfos.update(dwpt.getFieldInfos()) (in
DW.updateDocument) -- is there a risk that two threads bring a new
field into existence at the same time, but w/ different config?  Eg
one doc omitsTFAP and the other doesn't?  Or, on flush, does each DWPT
use its private FieldInfos to correctly flush the segment?  (Hmm: do
we seed each DWPT w/ the original FieldInfos created by IW on init?).
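
As an illustration of the concern only: one way to make concurrent field creation deterministic is an atomic merge into the shared map. GlobalFieldFlags and update are hypothetical names, and the "once any doc omits TFAP, the field omits" rule is an assumption for the sketch, not a statement of what the branch does.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical shared registry of one per-field flag. */
final class GlobalFieldFlags {
  // field name -> whether term freqs and positions are omitted
  private final ConcurrentHashMap<String, Boolean> omitTFAP = new ConcurrentHashMap<>();

  /**
   * Atomically registers or updates a field and returns the resolved flag.
   * Assumed rule: the flag is sticky, so the result does not depend on
   * which thread introduced the field first.
   */
  boolean update(String field, boolean omit) {
    return omitTFAP.merge(field, omit, (existing, incoming) -> existing || incoming);
  }
}
```

With a rule like this, two threads that introduce the same field with different settings converge on the same global value regardless of ordering. Whether each DWPT's private FieldInfos agrees with that value at flush time is still the open question.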

How are we handling the case where we open an IW and do delete-by-term
but add no docs?

Does DW.pushDeletes really need to sync on IW?  BufferedDeletes is
sync'd already.

DW.substractFlushedDocs is misspelled (not sure it's used though).

In DW.deleteTerms... shouldn't we skip a DWPT if it has no buffered

> Per thread DocumentsWriters that write their own private segments
> -----------------------------------------------------------------
>                 Key: LUCENE-2324
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>            Priority: Minor
>             Fix For: Realtime Branch
>         Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch,
> lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out
> See LUCENE-2293 for motivation and more details.
> I'm copying here Mike's summary he posted on 2293:
> Change the approach for how we buffer in RAM to a more isolated
> approach, whereby IW has N fully independent RAM segments
> in-process and when a doc needs to be indexed it's added to one of
> them. Each segment would also write its own doc stores and
> "normal" segment merging (not the inefficient merge we now do on
> flush) would merge them. This should be a good simplification in
> the chain (eg maybe we can remove the *PerThread classes). The
> segments can flush independently, letting us make much better
> concurrent use of IO & CPU.
