spark-issues mailing list archives

From "Sandy Ryza (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-5581) When writing sorted map output file, avoid open / close between each partition
Date Wed, 04 Feb 2015 02:39:34 GMT
Sandy Ryza created SPARK-5581:
---------------------------------

             Summary: When writing sorted map output file, avoid open / close between each partition
                 Key: SPARK-5581
                 URL: https://issues.apache.org/jira/browse/SPARK-5581
             Project: Spark
          Issue Type: Improvement
    Affects Versions: 1.3.0
            Reporter: Sandy Ryza


{code}
      // Bypassing merge-sort; get an iterator by partition and just write everything directly.
      for ((id, elements) <- this.partitionedIterator) {
        if (elements.hasNext) {
          val writer = blockManager.getDiskWriter(
            blockId, outputFile, ser, fileBufferSize, context.taskMetrics.shuffleWriteMetrics.get)
          for (elem <- elements) {
            writer.write(elem)
          }
          writer.commitAndClose()
          val segment = writer.fileSegment()
          lengths(id) = segment.length
        }
      }
{code}
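
The loop above opens a new disk writer and closes it again for every non-empty partition. One possible shape for the improvement in the summary is to open the output file once, write all partitions through the same stream, and record each partition's length as it goes. The following is only a minimal sketch under stated assumptions, not Spark's actual fix: SingleOpenWriteSketch and writePartitions are hypothetical names, elements are assumed to be already-serialized byte arrays, and plain java.io streams stand in for blockManager.getDiskWriter.

```scala
import java.io.{BufferedOutputStream, File, FileOutputStream}

// Hypothetical helper for illustration only; not part of Spark.
object SingleOpenWriteSketch {

  /**
   * Writes every partition to one file through a single stream and
   * returns the number of bytes written for each partition id.
   * Assumption: each element is an already-serialized byte array.
   */
  def writePartitions(
      outputFile: File,
      numPartitions: Int,
      partitions: Iterator[(Int, Iterator[Array[Byte]])]): Array[Long] = {
    val lengths = new Array[Long](numPartitions)
    // Open the file once, before iterating over the partitions.
    val out = new BufferedOutputStream(new FileOutputStream(outputFile))
    try {
      for ((id, elements) <- partitions) {
        var written = 0L
        for (elem <- elements) {
          out.write(elem)
          written += elem.length
        }
        // Empty partitions keep a length of 0, as in the original loop.
        lengths(id) = written
      }
    } finally {
      // Close once, after the last partition has been written.
      out.close()
    }
    lengths
  }
}
```

Counting bytes per element only works here because the sketch skips serialization and compression. With a serializer or compression stream in the path, segment boundaries would presumably have to come from flushing and reading the file position at the end of each partition.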



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


