spark-user mailing list archives

From klrmowse
Subject Re: [EXT] [Spark 2.x Core] .collect() size limit
Date Tue, 01 May 2018 15:49:36 GMT
Okay, I may have found an alternative to using .collect() for what I am
trying to achieve...

Initially, in the Spark application I am working on, I would call
.collect() on two separate RDDs to load them into a couple of ArrayLists
(which is why I was asking about the size limit on the driver).

I need to map the 1st RDD to the 2nd RDD according to a computation/function,
resulting in key-value pairs.

It turns out I don't need to call .collect() if I instead use
.zipPartitions(), to which I can pass the function.
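
A rough sketch of the kind of per-partition function that could be handed to
.zipPartitions(), written in plain Java so it runs without a Spark
installation. The class name ZipPartitionsSketch, the helper pairUp, and the
"name -> count squared" computation are all made up for illustration; the
real computation is not described in this thread. Stopping at the shorter
iterator is also just an assumption of this sketch.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

public class ZipPartitionsSketch {

    // Walks two partition iterators in lockstep and emits one key-value pair
    // per position. This mirrors the shape of the function that
    // zipPartitions calls once for each pair of corresponding partitions.
    // Assumption: if one side runs out first, the leftover elements on the
    // other side are ignored.
    static <A, B, K, V> Iterator<Map.Entry<K, V>> pairUp(
            Iterator<A> left,
            Iterator<B> right,
            BiFunction<A, B, Map.Entry<K, V>> compute) {
        List<Map.Entry<K, V>> out = new ArrayList<>();
        while (left.hasNext() && right.hasNext()) {
            out.add(compute.apply(left.next(), right.next()));
        }
        return out.iterator();
    }

    // Runs pairUp on two stand-in "partitions". In Spark, these iterators
    // would be supplied by zipPartitions on the executors, so nothing has to
    // be collected to the driver.
    static List<Map.Entry<String, Integer>> demo() {
        Iterator<String> names = Arrays.asList("a", "b", "c").iterator();
        Iterator<Integer> counts = Arrays.asList(1, 2, 3).iterator();

        // Hypothetical computation: key is the name, value is the count squared.
        BiFunction<String, Integer, Map.Entry<String, Integer>> compute =
                (name, n) -> new SimpleEntry<>(name, n * n);

        List<Map.Entry<String, Integer>> result = new ArrayList<>();
        pairUp(names, counts, compute).forEachRemaining(result::add);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With Spark's Java API, something like rddA.zipPartitions(rddB, (a, b) ->
pairUp(a, b, compute)) should work, as long as both RDDs have the same number
of partitions and the function being passed is serializable.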

I am currently testing it out...

Thanks, all, for your responses.
