lucene-dev mailing list archives

From "Shai Erera (JIRA)" <>
Subject [jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers
Date Sun, 18 May 2014 20:15:38 GMT


Shai Erera commented on LUCENE-5618:

bq. In SegmentReader.initDocValuesProducers, when there are no DV updates,
can't you just init dvp right off (not lazily)? Because up above we
only call it if FIS.hasDocValues.

Ooops, you're right. Will fix!

bq. I think what Rob meant by the double-lookup is we should just call
dvFields.get(field) first, and only if that's null do we do the logic
to initialize it. Ie, the common case here is retrieving a DV field
that's already loaded.

This is the code of getNumeric():

NumericDocValues dvs = (NumericDocValues) dvFields.get(field);
if (dvs == null) {
  // slow path: field not loaded yet, initialize it from its producer
  DocValuesProducer dvProducer = dvProducersByField.get(field);
  assert dvProducer != null;
  dvs = dvProducer.getNumeric(fi);
  dvFields.put(field, dvs);
}

It seems already optimized to do one lookup in the common case?
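
To make the pattern concrete outside of SegmentReader, here is a minimal standalone sketch of the same "look up first, initialize only on a miss" idea. It is not the Lucene code: LazyFieldCache, its loader function and the loads counter are hypothetical names used only for illustration, and the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a per-reader cache; not Lucene's SegmentReader.
final class LazyFieldCache<V> {
  // Plays the role of dvFields: values that are already loaded.
  private final Map<String, V> loaded = new HashMap<>();
  // Plays the role of the producer lookup plus getNumeric(...).
  private final Function<String, V> loader;
  // Counts how many times the slow path ran (for illustration only).
  int loads = 0;

  LazyFieldCache(Function<String, V> loader) {
    this.loader = loader;
  }

  V get(String field) {
    V value = loaded.get(field);      // common case: a single map lookup
    if (value == null) {
      value = loader.apply(field);    // slow path: initialize once
      loads++;
      loaded.put(field, value);
    }
    return value;
  }
}
```

Calling get("price") twice on such a cache runs the loader once; the second call pays only for the one map lookup.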

bq. Why would the newDVFiles contain e.getKey()? Aren't we only writing the new generation
update here?

Notice that the key is the fieldNumber (Integer) and not the generation (Long). I modified
SegmentCommitInfo to track the files per fieldNumber instead of per generation, to avoid future
issues. I also think it makes us more flexible, i.e. easier back-compat support if we
want to change things again. Therefore the existing files mapping may contain
a fieldNumber which we just rewrote (updated), hence the {{if}}.
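
As a rough illustration of why that check is needed when files are keyed by field number, here is a small standalone sketch. It is not the patch itself: FieldFilesMerge, merge and the two map parameters are hypothetical names, and the sketch only assumes a mapping from field number to a set of file names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; only illustrates merging per-field file mappings.
final class FieldFilesMerge {
  // Entries for fields that were just rewritten come from `updated`;
  // entries for untouched fields are carried over from `existing`.
  static Map<Integer, Set<String>> merge(Map<Integer, Set<String>> existing,
                                         Map<Integer, Set<String>> updated) {
    Map<Integer, Set<String>> result = new HashMap<>(updated);
    for (Map.Entry<Integer, Set<String>> e : existing.entrySet()) {
      // The key is a field number, so an existing entry may refer to a
      // field we just rewrote; in that case keep the new files only.
      if (!result.containsKey(e.getKey())) {
        result.put(e.getKey(), e.getValue());
      }
    }
    return result;
  }
}
```

If the mapping were keyed by generation, a freshly written generation could never collide with an existing key and the check would be unnecessary.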

bq. Also the indent is off a bit.

Thanks, I'll fix.

> DocValues updates send wrong fieldinfos to codec producers
> ----------------------------------------------------------
>                 Key: LUCENE-5618
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Assignee: Shai Erera
>            Priority: Blocker
>             Fix For: 4.9
>         Attachments: LUCENE-5618.patch, LUCENE-5618.patch, LUCENE-5618.patch
> Spinoff from LUCENE-5616.
> See the example there, docvalues readers get a fieldinfos, but it doesn't contain the
> correct ones, so they have invalid field numbers at read time.
> This should really be fixed. Maybe a simple solution is to not write "batches" of fields
> in updates but just have only one field per gen?
> This removes many-many relationships and would make things easy to understand.
