hive-issues mailing list archives

From "Sahil Takiar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17638) SparkDynamicPartitionPruner loads all partition metadata into memory
Date Thu, 28 Sep 2017 19:13:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184690#comment-16184690 ]

Sahil Takiar commented on HIVE-17638:
-------------------------------------

CC: [~janulatha]

> SparkDynamicPartitionPruner loads all partition metadata into memory
> --------------------------------------------------------------------
>
>                 Key: HIVE-17638
>                 URL: https://issues.apache.org/jira/browse/HIVE-17638
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Sahil Takiar
>
> The {{SparkDynamicPartitionPruner}} first loads the contents of each partition pruning
> file into memory, and then prunes the partitions from the {{MapWork}}. This can increase
> memory pressure on the HoS Remote Driver, because all the partition metadata must be
> held in memory at once. It would be more efficient to prune partitions while scanning
> the files, so that the partition metadata doesn't need to be buffered in memory.
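
The difference between buffering and pruning while scanning can be illustrated with a minimal sketch. This is not the Hive implementation: the class StreamingPruneSketch, the method pruneStreaming, and the one-value-per-line file layout are all hypothetical, chosen only to show the idea. Each pruning file is read one line at a time, and the only state kept is the set of candidate partitions that have matched so far, so memory is bounded by the number of candidate partitions rather than by the total size of the pruning files.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Hive.
class StreamingPruneSketch {

    // Returns the candidate partitions that appear in at least one pruning file,
    // in their original order. Assumes each file holds one partition value per line.
    static List<String> pruneStreaming(List<String> candidates, List<BufferedReader> pruningFiles) {
        Set<String> remaining = new HashSet<>(candidates); // candidates not yet matched
        Set<String> matched = new HashSet<>();             // candidates seen in some file

        for (BufferedReader file : pruningFiles) {
            // lines() reads lazily, so a value is discarded as soon as it has been checked.
            file.lines().forEach(value -> {
                if (remaining.remove(value)) {
                    matched.add(value);
                }
            });
        }

        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            if (matched.contains(candidate)) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("ds=1", "ds=2", "ds=3");
        List<BufferedReader> files = Arrays.asList(
            new BufferedReader(new StringReader("ds=3\nds=9\n")),
            new BufferedReader(new StringReader("ds=1\nds=3\n")));
        System.out.println(pruneStreaming(candidates, files));
    }
}
```

In this sketch, values that match no candidate (such as ds=9 above) are never stored, which is the property the description asks for.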



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
