spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jin xing (JIRA)" <>
Subject [jira] [Created] (SPARK-24143) filter empty blocks when convert mapstatus to (blockId, size) pair
Date Wed, 02 May 2018 05:40:00 GMT
jin xing created SPARK-24143:

             Summary: filter empty blocks when convert mapstatus to (blockId, size) pair
                 Key: SPARK-24143
             Project: Spark
          Issue Type: New Feature
          Components: Spark Core
    Affects Versions: 2.3.0
            Reporter: jin xing

In current code(MapOutputTracker.convertMapStatuses), mapstatus are converted to (blockId,
size) pair for all blocks -- no matter the block is empty or not, which result in OOM when
there are lots of consecutive empty blocks, especially when adaptive execution is enabled.

(blockId, size) pair is only used in ShuffleBlockFetcherIterator to control shuffle-read and
only non-empty block request is sent. Can we just filter out the empty blocks in  MapOutputTracker.convertMapStatuses
and save memory?

This message was sent by Atlassian JIRA

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message