spark-issues mailing list archives

From "Matt Cheah (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-3466) Limit size of results that a driver collects for each action
Date Tue, 21 Oct 2014 01:57:33 GMT

    [ https://issues.apache.org/jira/browse/SPARK-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14177826#comment-14177826 ]

Matt Cheah commented on SPARK-3466:
-----------------------------------

I got caught up in some other things, so I haven't had a chance to look into this deeply.
Off the top of my head, though, I was considering communicating the size of each partition's
result back to the driver, using ResultTask or something similar. We'd want a way for the
driver to know what the combined size of the computation results will be before it fetches
them.

I'm still unfamiliar with the codebase, so I'd have to investigate that approach further, but
if you believe your approach is also sound, feel free to point me to the relevant places in
the code. However, my hunch is that limiting the serialization stream on the executors might
not always work. If the data distribution across the executors at the end of the job is
extremely unbalanced, one executor might serialize a giant block that won't fit in its stream
even though that block would fit inside the driver. Or (less likely, if this were configured
correctly) the combined bytes across all of the executors could be more than the driver can
handle even though each individual executor's serialization stream is only near capacity, and
we would still run into this issue.
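For illustration, here is a minimal sketch of the driver-side accounting that approach would
need. All names here (ResultSizeTracker, addResult) are hypothetical, not Spark's actual
internals, and it assumes each finishing task reports its serialized result size to the driver
before (or alongside) the data itself:

{code:scala}
// Hypothetical sketch, not Spark's actual internals: each finishing task
// reports the serialized byte size of its partition's result, and the
// driver aborts once the running total would exceed the configured limit.
class ResultSizeTracker(maxResultSize: Long) {
  private var totalBytes = 0L

  /** Called on the driver as each task's result size is reported. */
  def addResult(sizeBytes: Long): Unit = synchronized {
    totalBytes += sizeBytes
    if (maxResultSize > 0 && totalBytes > maxResultSize) {
      // In real Spark this would abort the stage/job rather than just throw.
      throw new IllegalStateException(
        s"Total size of serialized results ($totalBytes bytes) exceeds " +
          s"the configured limit ($maxResultSize bytes); aborting job")
    }
  }
}
{code}

Accounting on the driver like this would catch both failure modes above: a single oversized
block from one executor, and many near-capacity blocks that together exceed what the driver
can hold.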

> Limit size of results that a driver collects for each action
> ------------------------------------------------------------
>
>                 Key: SPARK-3466
>                 URL: https://issues.apache.org/jira/browse/SPARK-3466
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Matei Zaharia
>            Assignee: Matthew Cheah
>
> Right now, operations like {{collect()}} and {{take()}} can crash the driver with an
> OOM if they bring back too much data. We should add a {{spark.driver.maxResultSize}} setting
> (or something like that) that will make the driver abort a job if its result is too big. We
> can set it to some fraction of the driver's memory by default, or to something like 100 MB.
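For a sense of how the proposed knob would look from user code (assuming the name suggested
above sticks; the description marks it as tentative), a sketch:

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

// Sketch of configuring the proposed limit; "100m" mirrors the 100 MB
// figure floated in the description, not a recommendation.
val conf = new SparkConf()
  .setMaster("local[*]")
  .setAppName("collect-limit-demo")
  .set("spark.driver.maxResultSize", "100m")
val sc = new SparkContext(conf)

// A collect() whose combined serialized results exceed the limit would
// then cause the job to be aborted instead of OOMing the driver.
{code}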



