You can take a look at https://issues.apache.org/jira/browse/SPARK-12837


Yong

Executing a SQL statement with a large number of partitions requires a large amount of memory on the driver, even when there are no requests to collect data back to the driver.
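As a sketch of the usual workarounds (assuming the error is hit at submit time and not a genuine collect of data): either raise the driver-side result-size limit, or reduce the number of tasks before writing. The `2g` value below is an illustrative assumption, not a recommendation.

```shell
# Option 1: raise the limit (default is 1g) when submitting the job.
# Setting it to 0 disables the check entirely (use with care).
spark-submit \
  --conf spark.driver.maxResultSize=2g \
  my-app.jar

# Option 2 (in the application code, not spark-submit): reduce the
# task count before writing, e.g.
#   mydataset.coalesce(200).write().csv("outputlocation")
# so fewer serialized task results are sent back to the driver.
```

Fewer, larger tasks reduce the accumulated per-task result metadata that counts against spark.driver.maxResultSize, at the cost of less write parallelism.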
From: Bahubali Jain <bahubali@gmail.com>
Sent: Thursday, March 16, 2017 1:39 PM
To: user@spark.apache.org
Subject: Dataset : Issue with Save
 
Hi,
While saving a dataset using mydataset.write().csv("outputlocation"), I am running into an exception:

"Total size of serialized results of 3722 tasks (1024.0 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)"

Does this mean that, when saving a dataset, the whole of the dataset's contents is sent to the driver, similar to a collect() action?

Thanks,
Baahu