spark-user mailing list archives

From Michael Segel <msegel_had...@hotmail.com>
Subject Re: Spark data frame
Date Tue, 22 Dec 2015 21:26:01 GMT
Dean, 

The RDD is in memory, and then collect() produces a collection, so both are alive at the
same time.
(Again, not sure how Tungsten plays into this…)

So his collection can’t be larger than 1/2 of the memory allocated to the heap. 

(Unless you have allocated swap…, right?) 
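
If the goal is just to iterate over the results on the driver, here is a minimal sketch of the trade-off (Java, Spark 1.x; the JavaRDD<String> parameter and the println are placeholders, not from this thread):

import java.util.Iterator;
import java.util.List;
import org.apache.spark.api.java.JavaRDD;

public class CollectVsIterate {

    // collect() materializes the whole dataset on the driver; if the RDD is also
    // cached, both copies are alive at once -- hence the 1/2-of-the-heap rule of thumb.
    static List<String> copyEverything(JavaRDD<String> rdd) {
        return rdd.collect();
    }

    // toLocalIterator() fetches one partition at a time, so peak driver memory is
    // bounded by the largest partition rather than by the whole dataset.
    static void streamInstead(JavaRDD<String> rdd) {
        Iterator<String> it = rdd.toLocalIterator();
        while (it.hasNext()) {
            System.out.println(it.next()); // stand-in for real per-record work
        }
    }
}

take(n) is another option when only a bounded sample is needed on the driver.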

> On Dec 22, 2015, at 12:11 PM, Dean Wampler <deanwampler@gmail.com> wrote:
> 
> You can call the collect() method to return a collection, but be careful. If your data is too big to fit in the driver's memory, it will crash.
> 
> Dean Wampler, Ph.D.
> Author: Programming Scala, 2nd Edition <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
> Typesafe <http://typesafe.com/>
> @deanwampler <http://twitter.com/deanwampler>
> http://polyglotprogramming.com <http://polyglotprogramming.com/>
> On Tue, Dec 22, 2015 at 1:09 PM, Gaurav Agarwal <gaurav130403@gmail.com <mailto:gaurav130403@gmail.com>> wrote:
> We are able to retrieve a data frame by filtering the RDD object. I need to convert that data frame into Java POJOs. Any idea how to do that?
> 
> 
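
In reply to Gaurav's question, one way to do this is to map each Row to a bean on the executors. A sketch (Java, Spark 1.x DataFrame API; the Person class and the "name"/"age" columns are made-up for illustration):

import java.io.Serializable;
import java.util.List;
import org.apache.spark.sql.DataFrame;

public class Person implements Serializable {
    private String name;
    private int age;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    // Map each Row to a POJO on the executors, then collect --
    // subject to Dean's caveat about driver memory.
    public static List<Person> fromDataFrame(DataFrame df) {
        return df.javaRDD().map(row -> {
            Person p = new Person();
            p.setName(row.getString(row.fieldIndex("name")));
            p.setAge(row.getInt(row.fieldIndex("age")));
            return p;
        }).collect();
    }
}

With Spark 1.6's Datasets you could also look at Encoders.bean(Person.class), but the plain map() above works on the 1.3+ DataFrame API.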

