spark-dev mailing list archives

From Chang Chen <baibaic...@gmail.com>
Subject Performance of VectorizedRleValuesReader
Date Mon, 14 Sep 2020 02:25:00 GMT
Hi experts,

It looks like there is a hot spot in VectorizedRleValuesReader#readNextGroup():

case PACKED:
  int numGroups = header >>> 1;
  this.currentCount = numGroups * 8;

  if (this.currentBuffer.length < this.currentCount) {
    this.currentBuffer = new int[this.currentCount];
  }
  currentBufferIdx = 0;
  int valueIndex = 0;
  while (valueIndex < this.currentCount) {
    // values are bit packed 8 at a time, so reading bitWidth will always work
    ByteBuffer buffer = in.slice(bitWidth);
    this.packer.unpack8Values(buffer, buffer.position(), this.currentBuffer, valueIndex);
    valueIndex += 8;
  }


According to my profile, about 30% of the time in readNextGroup() is spent in
slice. Why can't we call slice outside the loop?
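
To make the question concrete, here is a minimal, self-contained sketch of
the change I have in mind. It is not the Spark code: it uses a plain
java.nio.ByteBuffer in place of the input stream, the class and method names
are made up for illustration, and unpack8Values below is a simplified
stand-in for packer.unpack8Values that only handles bit widths 1 to 8,
packed LSB-first. It also assumes that slicing bitWidth * numGroups bytes
once returns one contiguous view, which may not hold for every input stream
implementation.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch: compares slicing once per group with slicing once per run.
final class SliceHoistSketch {

    // Simplified stand-in for packer.unpack8Values (assumption: bitWidth in 1..8,
    // values packed LSB-first). Reads bitWidth bytes starting at pos.
    static void unpack8Values(ByteBuffer buf, int pos, int[] out, int outPos, int bitWidth) {
        long bits = 0;
        for (int i = 0; i < bitWidth; i++) {
            bits |= (buf.get(pos + i) & 0xFFL) << (8 * i);
        }
        long mask = (1L << bitWidth) - 1;
        for (int i = 0; i < 8; i++) {
            out[outPos + i] = (int) ((bits >>> (i * bitWidth)) & mask);
        }
    }

    // Mirrors the current loop: a new slice of bitWidth bytes for every group of 8 values.
    static int[] unpackSlicePerGroup(ByteBuffer in, int numGroups, int bitWidth) {
        int[] out = new int[numGroups * 8];
        ByteBuffer src = in.duplicate(); // leave the caller's position untouched
        for (int g = 0; g < numGroups; g++) {
            ByteBuffer window = src.duplicate();
            window.limit(window.position() + bitWidth);
            ByteBuffer group = window.slice();
            src.position(src.position() + bitWidth);
            unpack8Values(group, group.position(), out, g * 8, bitWidth);
        }
        return out;
    }

    // Proposed variant: slice the whole packed run once, then advance an offset.
    static int[] unpackSliceOnce(ByteBuffer in, int numGroups, int bitWidth) {
        int[] out = new int[numGroups * 8];
        ByteBuffer window = in.duplicate();
        window.limit(window.position() + numGroups * bitWidth);
        ByteBuffer run = window.slice();
        int pos = run.position();
        for (int g = 0; g < numGroups; g++) {
            unpack8Values(run, pos, out, g * 8, bitWidth);
            pos += bitWidth;
        }
        return out;
    }

    public static void main(String[] args) {
        // Two groups of 8 values at bitWidth 3, so 6 bytes of packed data.
        byte[] data = {(byte) 0x88, (byte) 0xC6, (byte) 0xFA, 0x11, 0x22, 0x33};
        ByteBuffer buf = ByteBuffer.wrap(data);
        int[] perGroup = unpackSlicePerGroup(buf, 2, 3);
        int[] once = unpackSliceOnce(buf, 2, 3);
        System.out.println(Arrays.toString(perGroup));
        System.out.println(Arrays.equals(perGroup, once));
    }
}
```

Both variants read the same bytes in the same order, so they should produce
identical output; the second one just pays for one slice per run instead of
one per group. Whether this is safe in the real reader depends on how
in.slice behaves when the requested range is not contiguous.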
