spark-user mailing list archives

From Yangcheng Huang <yangcheng.hu...@huawei.com>
Subject Any limitations of spark.shuffle.spill?
Date Wed, 05 Nov 2014 18:04:38 GMT
Hi

One question about what spark.shuffle.spill can guarantee (I know this has been asked several times :-).

Basically, when handling a (cached) dataset that doesn't fit in memory, Spark can spill it to disk.
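
For reference, a minimal sketch of the configuration I mean, assuming Spark 1.x (the app name and the memoryFraction value are illustrative; spark.shuffle.spill is already true by default):

import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch (Spark 1.x): spark.shuffle.spill is on by default, but can be
// set explicitly. spark.shuffle.memoryFraction bounds how much of the heap
// shuffle aggregation may use before spilling to disk.
val conf = new SparkConf()
  .setAppName("ShuffleSpillExample")            // illustrative app name
  .set("spark.shuffle.spill", "true")           // allow shuffle data to spill to disk
  .set("spark.shuffle.memoryFraction", "0.2")   // 1.x default; raise it to spill less often

val sc = new SparkContext(conf)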

However, can I say that, when this setting is enabled, Spark will handle the situation faultlessly, no matter

(1)    how big the dataset is (compared to the available memory), or

(2)    how complex the computation being carried out is?

In other words, can spark.shuffle.spill handle this perfectly?

Here we assume that (1) disk space is unlimited and (2) the code is written correctly according to the functional requirements.
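
For concreteness, the kind of job I have in mind looks roughly like this (a hedged sketch reusing the sc above; the paths and the keying logic are hypothetical):

// Hypothetical shuffle-heavy job: groupByKey routes all values for a key to a
// single reducer, so a skewed key can exceed executor memory and force spilling.
val records = sc.textFile("hdfs:///path/to/large/input")    // hypothetical path
  .map(line => (line.split(",")(0), line))                  // key by the first field
val grouped = records.groupByKey()                          // wide dependency -> shuffle
grouped.mapValues(_.size)
  .saveAsTextFile("hdfs:///path/to/output")                 // hypothetical path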

The reason I ask is that, under exactly these conditions, I keep receiving warnings like "FetchFailed" whenever memory usage reaches the limit.
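
For what it's worth, these are the shuffle-related settings that seem relevant here (all in the Spark 1.x configuration docs; the values below are illustrative only):

import org.apache.spark.SparkConf

// Sketch of settings commonly tuned when shuffles fail under memory pressure.
val tunedConf = new SparkConf()
  .set("spark.executor.memory", "8g")                      // more headroom per executor
  .set("spark.shuffle.consolidateFiles", "true")           // fewer shuffle files on disk
  .set("spark.core.connection.ack.wait.timeout", "600")    // tolerate long GC pauses during fetches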

Thanks
YC
