lucene-java-user mailing list archives

From kai loofi <kailo...@gmail.com>
Subject [Issue] omitNorms ignored in DefaultIndexingChain.getOrAddField method
Date Fri, 31 May 2019 23:28:40 GMT
Hello,

I am a Lucene user building a search platform on top of Lucene 6.6.1. I ran
into some unexpected behavior and would like the community's opinion. Norms
were being created even though I set *omitNorms=true* in the field types. I
traced the issue to *DefaultIndexingChain.getOrAddField*, which creates a
*FieldInfo* object the first time it sees a field in a segment. That object
is created with omitNorms set to false. The method then copies
*indexOptions* from the fieldType onto the new object but does not do the
same for *omitNorms*. As far as I can tell, this effectively overrides the
flag I set, which causes problems down the line.

Here's the relevant part of the method, including the *fieldInfos.getOrAdd* call:

private PerField getOrAddField(String name, IndexableFieldType fieldType, boolean invert) {

  // Make sure we have a PerField allocated
  final int hashPos = name.hashCode() & hashMask;
  PerField fp = fieldHash[hashPos];
  while (fp != null && !fp.fieldInfo.name.equals(name)) {
    fp = fp.next;
  }

  if (fp == null) {
    // First time we are seeing this field in this segment

    FieldInfo fi = fieldInfos.getOrAdd(name);
    // Messy: must set this here because e.g. FreqProxTermsWriterPerField looks at the initial
    // IndexOptions to decide what arrays it must create).  Then, we also must set it in
    // PerField.invert to allow for later downgrading of the index options:
    fi.setIndexOptions(fieldType.indexOptions());

    fp = new PerField(fi, invert);

    ...

The *getOrAdd* method below instantiates a new *FieldInfo* with false passed
as the 4th constructor argument, which is omitNorms:

/** Create a new field, or return existing one. */
public FieldInfo getOrAdd(String name) {
  FieldInfo fi = fieldInfo(name);
  if (fi == null) {
    // This field wasn't yet added to this in-RAM
    // segment's FieldInfo, so now we get a global
    // number for this field.  If the field was seen
    // before then we'll get the same name and number,
    // else we'll allocate a new one:
    final int fieldNumber = globalFieldNumbers.addOrGet(name, -1, DocValuesType.NONE, 0, 0);
    fi = new FieldInfo(name, fieldNumber, false, false, false,
        IndexOptions.NONE, DocValuesType.NONE, -1, new HashMap<>(), 0, 0);
    assert !byName.containsKey(fi.name);
    globalFieldNumbers.verifyConsistent(Integer.valueOf(fi.number), fi.name, DocValuesType.NONE);
    byName.put(fi.name, fi);
  }

  return fi;
}
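
To make the flow concrete, here is a minimal, self-contained sketch of what I
think is happening. It does not use Lucene's classes: SketchFieldInfo,
SketchFieldType and both firstPass... methods are hypothetical stand-ins I made
up for illustration, and the second variant is only one way the flag could be
carried over, not a tested patch.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the behavior described in this message.
// None of these types are Lucene classes; all names are hypothetical.
class OmitNormsSketch {

  /** Stand-in for a per-field info record. */
  static final class SketchFieldInfo {
    final String name;
    boolean omitNorms;
    String indexOptions; // plain string standing in for IndexOptions

    SketchFieldInfo(String name, boolean omitNorms, String indexOptions) {
      this.name = name;
      this.omitNorms = omitNorms;
      this.indexOptions = indexOptions;
    }
  }

  /** Stand-in for the field type the user configured. */
  static final class SketchFieldType {
    final boolean omitNorms;
    final String indexOptions;

    SketchFieldType(boolean omitNorms, String indexOptions) {
      this.omitNorms = omitNorms;
      this.indexOptions = indexOptions;
    }
  }

  private final Map<String, SketchFieldInfo> byName = new HashMap<>();

  /** Mirrors getOrAdd(name): a new entry always starts with omitNorms=false. */
  SketchFieldInfo getOrAdd(String name) {
    return byName.computeIfAbsent(name, n -> new SketchFieldInfo(n, false, "NONE"));
  }

  /** First pass as described above: only the index options are copied. */
  SketchFieldInfo firstPassAsDescribed(String name, SketchFieldType type) {
    SketchFieldInfo fi = getOrAdd(name);
    fi.indexOptions = type.indexOptions;
    return fi;
  }

  /** Assumed alternative: also carry the omitNorms flag over from the field type. */
  SketchFieldInfo firstPassCopyingOmitNorms(String name, SketchFieldType type) {
    SketchFieldInfo fi = getOrAdd(name);
    fi.indexOptions = type.indexOptions;
    if (type.omitNorms) {
      fi.omitNorms = true;
    }
    return fi;
  }

  public static void main(String[] args) {
    SketchFieldType type = new SketchFieldType(true, "DOCS");
    SketchFieldInfo a = new OmitNormsSketch().firstPassAsDescribed("body", type);
    SketchFieldInfo b = new OmitNormsSketch().firstPassCopyingOmitNorms("body", type);
    System.out.println("as described: omitNorms=" + a.omitNorms); // false
    System.out.println("copying flag: omitNorms=" + b.omitNorms); // true
  }
}
```

In the first variant the requested omitNorms=true never reaches the field info
object, which matches what I am observing.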

I was thinking of opening this as a bug against Lucene, but I would like to
get some feedback first and make sure I am not missing anything. Thanks in
advance.

Regards,
Ishan
