lucene-dev mailing list archives

From "Shai Erera (JIRA)" <>
Subject [jira] [Updated] (LUCENE-5189) Numeric DocValues Updates
Date Tue, 03 Sep 2013 10:49:52 GMT


Shai Erera updated LUCENE-5189:

    Attachment: LUCENE-5189.patch

The patch adds per-field support. I currently do that by adding a boolean 'isFieldUpdate' to SegWriteState,
which is set to true only by ReaderAndLiveDocs. PerFieldDVF then peeks at that boolean and,
if it's true, reads the format name from FieldInfo.attributes() instead of relying on Codec.getPerFieldDVF().
If we eventually gen FieldInfos, there won't be a need for this boolean, as PerFieldDVF
will get that from FI.dvGen.
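
To make the lookup rule above concrete, here is a minimal, self-contained sketch. It is not the code in the patch: FieldInfoSketch, WriteStateSketch, FormatLookup, FORMAT_KEY and resolveFormatName are hypothetical stand-ins for FieldInfo, SegWriteState, the Codec-level per-field lookup and the attribute key. The sketch also assumes that the format name is recorded in the field's attributes on a regular write, and that a missing attribute during an update is treated as an error.

```java
import java.util.HashMap;
import java.util.Map;

public class PerFieldFormatSketch {

    /** Hypothetical stand-in for FieldInfo: a field name plus string attributes. */
    static final class FieldInfoSketch {
        final String name;
        final Map<String, String> attributes = new HashMap<>();

        FieldInfoSketch(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for SegWriteState, carrying only the new boolean. */
    static final class WriteStateSketch {
        final boolean isFieldUpdate;

        WriteStateSketch(boolean isFieldUpdate) {
            this.isFieldUpdate = isFieldUpdate;
        }
    }

    /** Hypothetical stand-in for asking the Codec which format a field should use. */
    interface FormatLookup {
        String formatNameFor(String fieldName);
    }

    /** Hypothetical attribute key; the real key name is not given in the message. */
    static final String FORMAT_KEY = "sketch.dv.format";

    /**
     * Picks the DocValues format name for a field.
     * For a field update, reuse the name recorded in the field's attributes.
     * For a regular write, ask the codec and record the answer (an assumption).
     */
    static String resolveFormatName(WriteStateSketch state,
                                    FieldInfoSketch field,
                                    FormatLookup codecLookup) {
        if (state.isFieldUpdate) {
            String recorded = field.attributes.get(FORMAT_KEY);
            if (recorded == null) {
                // Assumption: an update to a field with no recorded format is an error.
                throw new IllegalStateException("no format recorded for field " + field.name);
            }
            return recorded;
        }
        String chosen = codecLookup.formatNameFor(field.name);
        field.attributes.put(FORMAT_KEY, chosen);
        return chosen;
    }

    public static void main(String[] args) {
        FieldInfoSketch price = new FieldInfoSketch("price");

        // Regular write: the codec decides, and the choice is recorded on the field.
        String onWrite = resolveFormatName(new WriteStateSketch(false), price, f -> "FormatA");

        // Later update: the codec would now answer differently, but the recorded name wins.
        String onUpdate = resolveFormatName(new WriteStateSketch(true), price, f -> "FormatB");

        System.out.println(onWrite + " " + onUpdate);
    }
}
```

The point of the design is that an update written long after the original segment still goes through the format the field was first written with, even if the Codec's per-field choice has changed in the meantime.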

So far all Codecs work. I had to remove an assert from SimpleText which tested that all fields
read from the file are in state.fieldInfos. SimpleText doesn't use that information for anything
other than the assert. Also, SegCoreReader now passes to each DVProducer only the fields it needs to read.

Added some tests too.
> Numeric DocValues Updates
> -------------------------
>                 Key: LUCENE-5189
>                 URL:
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>         Attachments: LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch,
LUCENE-5189.patch, LUCENE-5189.patch
> In LUCENE-4258 we started to work on incremental field updates, however the number of
changes is immense and hard to follow/consume. The reason is that we targeted postings, stored
fields, DV etc., all from the get-go.
> I'd like to start afresh here, with numeric-dv-field updates only. There are a couple
of reasons for that:
> * NumericDV fields should be easier to update, if e.g. we write all the values of all
the documents in a segment for the updated field (similar to how livedocs work, and previously
> * It's a fairly contained issue, attempting to handle just one data type to update, yet
requires many changes to core code which will also be useful for updating other data types.
> * It has value in and of itself, and we don't need to allow updating all the data types
in Lucene at once ... we can do that gradually.
> I already have a working patch, which I'll upload next, explaining the changes.

