spark-user mailing list archives

From Yong Zhang <java8...@hotmail.com>
Subject Re: Dataset : Issue with Save
Date Thu, 16 Mar 2017 19:49:27 GMT
You can take a look at https://issues.apache.org/jira/browse/SPARK-12837


Yong

Spark driver requires large memory space for serialized ...<https://issues.apache.org/jira/browse/SPARK-12837>
issues.apache.org
Executing a SQL statement with a large number of partitions requires a large amount of memory on the
driver even if there are no requests to collect data back to the driver.
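
If the goal is simply to get the write to finish, the usual workarounds are to raise
spark.driver.maxResultSize and/or cut down the number of write tasks. A rough sketch in Java
(untested against your job; the class name, the paths, the "2g" limit and the 200-partition
count are only placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class CsvSaveExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("csv-save-example")
                // Raise the cap on serialized task results kept by the driver
                // (default 1g); "2g" is only an illustrative value.
                .config("spark.driver.maxResultSize", "2g")
                .getOrCreate();

        // "inputlocation" is a placeholder for however the dataset is built.
        Dataset<Row> mydataset = spark.read().parquet("inputlocation");

        // Fewer output partitions means fewer tasks, so less serialized
        // task-result data accumulates on the driver during the write.
        mydataset.coalesce(200).write().csv("outputlocation");

        spark.stop();
    }
}

Raising the limit mostly hides the symptom; reducing the number of tasks is what actually shrinks
the serialized results the driver has to hold, which is what SPARK-12837 describes.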




________________________________
From: Bahubali Jain <bahubali@gmail.com>
Sent: Thursday, March 16, 2017 1:39 PM
To: user@spark.apache.org
Subject: Dataset : Issue with Save

Hi,
While saving a dataset using mydataset.write().csv("outputlocation"), I am running into an exception:

"Total size of serialized results of 3722 tasks (1024.0 MB) is bigger than spark.driver.maxResultSize
(1024.0 MB)"

Does it mean that, in order to save a dataset, the whole of its contents is sent to the driver,
similar to a collect() action?

Thanks,
Baahu
