spark-user mailing list archives

From Bahubali Jain <bahub...@gmail.com>
Subject Re: Dataset : Issue with Save
Date Fri, 17 Mar 2017 02:34:38 GMT
Hi,
Has this not been resolved yet?
Saving a dataframe is a very common requirement. Is there a better way to
save a dataframe that avoids sending data to the driver?


*"Total size of serialized results of 3722 tasks (1024.0 MB) is bigger than
spark.driver.maxResultSize (1024.0 MB)"*
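
For a sense of scale, here is a minimal sketch of the arithmetic behind that
message. It assumes the 1024.0 MB in the error means 1024 MiB and that the
total is spread evenly across the 3722 tasks, neither of which the message
itself states. The class and method names are mine, purely for illustration.

```java
public class ResultSizeEstimate {

    // Average serialized result size per task, in whole bytes.
    // Uses integer division, so any remainder is dropped.
    static long averageBytesPerTask(long totalBytes, int taskCount) {
        if (taskCount <= 0) {
            throw new IllegalArgumentException("taskCount must be positive");
        }
        return totalBytes / taskCount;
    }

    public static void main(String[] args) {
        // Assumption: "1024.0 MB" in the error is 1024 * 1024 * 1024 bytes.
        long totalBytes = 1024L * 1024 * 1024;
        // Task count taken from the error message.
        int taskCount = 3722;

        long perTask = averageBytesPerTask(totalBytes, taskCount);
        System.out.printf("average per task: %d bytes (%.1f KiB)%n",
                perTask, perTask / 1024.0);
    }
}
```

That comes to roughly 280 KiB per task, which looks more like per-task
bookkeeping than rows of the dataset, so the limit seems to be hit because of
the number of tasks. This is only my inference from the numbers and from the
SPARK-12837 description quoted below, not something the error confirms.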
Thanks,
Baahu

On Fri, Mar 17, 2017 at 1:19 AM, Yong Zhang <java8964@hotmail.com> wrote:

> You can take a look at https://issues.apache.org/jira/browse/SPARK-12837
>
>
> Yong
> Spark driver requires large memory space for serialized ...
> Executing a sql statement with a large number of partitions requires a
> high memory space for the driver even there are no requests to collect data
> back to the driver.
>
>
>
> ------------------------------
> *From:* Bahubali Jain <bahubali@gmail.com>
> *Sent:* Thursday, March 16, 2017 1:39 PM
> *To:* user@spark.apache.org
> *Subject:* Dataset : Issue with Save
>
> Hi,
> While saving a dataset using *mydataset.write().csv("outputlocation")*
> I am running into an exception:
>
>
>
> *"Total size of serialized results of 3722 tasks (1024.0 MB) is bigger
> than spark.driver.maxResultSize (1024.0 MB)"*
> Does this mean that, to save a dataset, its entire contents are
> sent to the driver, similar to a collect() action?
>
> Thanks,
> Baahu
>



-- 
Twitter:http://twitter.com/Baahu
