tez-dev mailing list archives

From "Jonathan Eagles (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TEZ-3709) TezMerger is slow for high number of segments
Date Wed, 03 May 2017 22:51:04 GMT
Jonathan Eagles created TEZ-3709:
------------------------------------

             Summary: TezMerger is slow for high number of segments
                 Key: TEZ-3709
                 URL: https://issues.apache.org/jira/browse/TEZ-3709
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Jonathan Eagles
            Assignee: Jonathan Eagles


The code below performs poorly at scale: each ArrayList#remove(0) call shifts (memcpy) the whole
remaining list of segments, so the copy happens once per item in the batch instead of just once per batch.
This affects both computeBytesInMerges and getSegmentDescriptors.
{code}
for (int i = 0; i < batch; i++) {
  // "segments" stands for the ArrayList of segments; each call shifts all remaining elements
  segments.remove(0);
}
{code}
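
To illustrate the difference, here is a minimal sketch of one possible way to remove a batch with a single shift. It is not necessarily the change made for this issue. The class and method names (BatchRemoval, takeBatchSlow, takeBatchFast) are hypothetical, and the segments are modelled as a plain list of integers rather than TezMerger's segment type. It relies on the standard idiom of clearing a subList view, which lets ArrayList remove the whole range at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchRemoval {

    // Pattern from the report: every remove(0) shifts all remaining
    // elements left by one, so the cost is roughly batch * list size.
    static <T> List<T> takeBatchSlow(List<T> segments, int batch) {
        List<T> taken = new ArrayList<>(batch);
        for (int i = 0; i < batch; i++) {
            taken.add(segments.remove(0));
        }
        return taken;
    }

    // Hypothetical alternative: copy the first `batch` elements, then
    // clear them as a range so the tail is shifted only once.
    static <T> List<T> takeBatchFast(List<T> segments, int batch) {
        List<T> head = segments.subList(0, batch);
        List<T> taken = new ArrayList<>(head);
        head.clear(); // removes elements [0, batch) from the backing list
        return taken;
    }

    public static void main(String[] args) {
        List<Integer> segments = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        List<Integer> taken = takeBatchFast(segments, 2);
        System.out.println(taken + " " + segments);
    }
}
```

Both methods return the same batch and leave the same remainder in the list. Only the number of element shifts differs, which matters when the segment list is long.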



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
