spark-user mailing list archives

From Deepak Goel <deic...@gmail.com>
Subject Re: [Spark 2.x Core] .collect() size limit
Date Sat, 28 Apr 2018 16:48:15 GMT
There is also such a thing as *virtual memory* - the driver can spill to swap, though performance degrades badly once it does.
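To put a concrete number on the limit: in Spark 2.x the result of collect() is bounded not only by the driver heap (spark.driver.memory) but also by spark.driver.maxResultSize, which defaults to 1g and makes the job fail if the serialized result exceeds it. A sketch of how both might be raised at submit time (the 8g/4g values and app name are illustrative, not from this thread):

```shell
# Hypothetical spark-submit invocation: give the driver an 8 GB heap and
# allow serialized collect() results up to 4 GB (the 2.x default is 1g).
# Note spark.driver.memory must be set before the driver JVM starts,
# which is why it is passed here rather than in SparkConf at runtime.
spark-submit \
  --driver-memory 8g \
  --conf spark.driver.maxResultSize=4g \
  my_app.py
```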

On Sat, 28 Apr 2018, 21:19 Stephen Boesch, <javadba@gmail.com> wrote:

> Do you have a machine with terabytes of RAM? afaik collect() materializes
> the entire result in driver RAM - so that would be your limiting factor.
>
> 2018-04-28 8:41 GMT-07:00 klrmowse <klrmowse@gmail.com>:
>
>> i am currently trying to find a workaround for the Spark application i am
>> working on so that it does not have to use .collect()
>>
>> but, for now, it is going to have to use .collect()
>>
>> what is the size limit (in driver memory) of an RDD that .collect() can
>> work with?
>>
>> i've been scouring google-search - S.O., blogs, etc. - and everyone
>> cautions about .collect(), but no one specifies how huge is huge... are we
>> talking about a few gigabytes? terabytes?? petabytes???
>>
>>
>>
>> thank you
>>
>>
>>
>> --
>> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>>
>>
>
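For a rough feel of whether a given collect() will fit, a back-of-the-envelope check helps (plain Python; the record counts, sizes, and the safety factor are illustrative assumptions, not Spark internals):

```python
# Rough check: will collecting N records fit in the driver's heap?
# All sizes here are illustrative assumptions, not measured Spark values.

def fits_in_driver(num_records, avg_record_bytes, driver_heap_bytes,
                   safety_factor=0.5):
    """Return True if the collected result is likely to fit.

    safety_factor leaves headroom for JVM object overhead,
    deserialization copies, and the driver's own working set.
    """
    estimated = num_records * avg_record_bytes
    return estimated <= driver_heap_bytes * safety_factor

# 100 million records of ~200 bytes each is ~20 GB of raw data:
# hopeless on a 4 GB driver heap, plausible on a 64 GB one.
print(fits_in_driver(100_000_000, 200, 4 * 1024**3))   # False
print(fits_in_driver(100_000_000, 200, 64 * 1024**3))  # True
```

If the estimate fails, the usual workarounds are take(n) for a sample, toLocalIterator() to stream one partition at a time, or writing the RDD out to storage instead of collecting it.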
