spark-user mailing list archives

From Rohit Verma <rohit.ve...@rokittech.com>
Subject ToLocalIterator vs collect
Date Thu, 05 Jan 2017 10:39:39 GMT
Hi all,

I am aware that collect returns the whole RDD as a list on the driver, which causes an OOM
error when the list is too big.
Is toLocalIterator safe to use with a very big list? I want to access all values one by one.

Basically, the goal is to compare two sorted RDDs (A and B) to find the top k entries that
are present in A but missing from B.
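
The comparison described above can be sketched as a merge-style walk over two sorted
streams. This is only an illustration in plain Python, not a tested Spark job: the function
names missing_from_b and top_k_missing are hypothetical, the inputs are assumed to be sorted
ascending with the "top" entries first, and in PySpark the two iterators would presumably
come from rdd_a.toLocalIterator() and rdd_b.toLocalIterator(). Per the Spark documentation,
such an iterator is expected to need about as much driver memory as the largest partition,
rather than the whole RDD.

```python
from itertools import islice

_END = object()  # sentinel marking an exhausted B stream


def missing_from_b(a_iter, b_iter):
    """Lazily yield items of sorted stream A that do not occur in sorted stream B.

    Assumes both streams are sorted ascending. Duplicates in A are matched
    against a single occurrence in B (set-like semantics).
    """
    b_iter = iter(b_iter)
    b = next(b_iter, _END)
    for a in a_iter:
        # Advance B until it catches up with the current A value.
        while b is not _END and b < a:
            b = next(b_iter, _END)
        if b is _END or b != a:
            yield a


def top_k_missing(a_iter, b_iter, k):
    """Return the first k entries of A that are missing from B."""
    # islice stops pulling from both streams once k results are found.
    return list(islice(missing_from_b(a_iter, b_iter), k))


if __name__ == "__main__":
    # Plain lists stand in for rdd_a.toLocalIterator() / rdd_b.toLocalIterator().
    a = iter([1, 2, 3, 5, 8, 9])
    b = iter([2, 3, 4, 8])
    print(top_k_missing(a, b, 2))
```

Because the walk is lazy and stops after k results, only the leading part of each stream
has to be pulled to the driver when the missing entries show up early.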

Rohit