lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Summer Shire <shiresum...@gmail.com>
Subject Re: solr 4.7.2 mergeFactor/ Merge policy issue
Date Sat, 07 Mar 2015 01:20:58 GMT
Hi All,

Here’s more update on where I am at with this.
I enabled infoStream logging and quickly figured that I need to get rid of maxBufferedDocs.
So Erick you 
were absolutely right on that.
I increased my ramBufferSize to 100MB
and reduced maxMergeAtOnce to 3 and segmentsPerTier to 3 as well.
My config looks like this 

<indexConfig>
    <useCompoundFile>false</useCompoundFile>
    <ramBufferSizeMB>100</ramBufferSizeMB>

    <!--<maxMergeSizeForForcedMerge>9223372036854775807</maxMergeSizeForForcedMerge>-->
    <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
      <int name="maxMergeAtOnce">3</int>
      <int name="segmentsPerTier">3</int>
    </mergePolicy>
    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
    <infoStream file=“/tmp/INFOSTREAM.txt”>true</infoStream>
  </indexConfig>

I am attaching a sample infostream log file.
In the infoStream logs though you an see how the segments keep on adding
and it shows (just an example )
allowedSegmentCount=10 vs count=9 (eligible count=9) tooBigCount=0

I looked at TieredMergePolicy.java to see how allowedSegmentCount is getting calculated
// Compute max allowed segs in the index
    long levelSize = minSegmentBytes;
    long bytesLeft = totIndexBytes;
    double allowedSegCount = 0;
    while(true) {
      final double segCountLevel = bytesLeft / (double) levelSize;
      if (segCountLevel < segsPerTier) {
        allowedSegCount += Math.ceil(segCountLevel);
        break;
      }
      allowedSegCount += segsPerTier;
      bytesLeft -= segsPerTier * levelSize;
      levelSize *= maxMergeAtOnce;
    }
    int allowedSegCountInt = (int) allowedSegCount;
and the minSegmentBytes is calculated as follows
 // Compute total index bytes & print details about the index
    long totIndexBytes = 0;
    long minSegmentBytes = Long.MAX_VALUE;
    for(SegmentInfoPerCommit info : infosSorted) {
      final long segBytes = size(info);
      if (verbose()) {
        String extra = merging.contains(info) ? " [merging]" : "";
        if (segBytes >= maxMergedSegmentBytes/2.0) {
          extra += " [skip: too large]";
        } else if (segBytes < floorSegmentBytes) {
          extra += " [floored]";
        }
        message("  seg=" + writer.get().segString(info) + " size=" + String.format(Locale.ROOT,
"%.3f", segBytes/1024/1024.) + " MB" + extra);
      }

      minSegmentBytes = Math.min(segBytes, minSegmentBytes);
      // Accum total byte size
      totIndexBytes += segBytes;
    }


any input is welcome. 


Mime
View raw message