spark-user mailing list archives

From kant kodali <kanth...@gmail.com>
Subject What is the difference between forEachAsync vs forEachPartitionAsync?
Date Mon, 03 Apr 2017 03:36:01 GMT
Hi all,

What is the difference between forEachAsync and forEachPartitionAsync? I
couldn't find any description of either in the Javadoc. Here is my guess;
please correct me if I am wrong.

forEachAsync: iterates through the values of all partitions one by one,
asynchronously.

forEachPartitionAsync: fans out the partitions and runs the lambda once
per partition, in parallel across different workers. Inside the lambda,
the values of that partition are iterated one by one, asynchronously.

Is this right, or am I completely wrong?
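
To make the question concrete, here is a minimal plain-Java sketch of how
I understand the two calls. It is a model, not Spark code: the class
AsyncForeachModel and its list-of-lists "partitions" are hypothetical
stand-ins, and CompletableFuture stands in for the future that Spark
returns. Spark's Java API spells the real methods foreachAsync and
foreachPartitionAsync. The assumption being modeled is that both calls
process partitions in parallel and both return to the caller right away;
the difference is only whether the function receives one element or an
iterator over a whole partition.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Plain-Java model (NOT Spark code): the outer list stands in for an RDD's partitions.
final class AsyncForeachModel {

    // Models foreachAsync: the function is called once per element.
    // Assumption: this is equivalent to iterating each partition and calling f on every value.
    static <T> CompletableFuture<Void> foreachAsync(List<List<T>> partitions, Consumer<T> f) {
        return foreachPartitionAsync(partitions, it -> it.forEachRemaining(f));
    }

    // Models foreachPartitionAsync: the function is called once per partition
    // and receives an iterator over that partition's elements.
    static <T> CompletableFuture<Void> foreachPartitionAsync(
            List<List<T>> partitions, Consumer<Iterator<T>> f) {
        // One task per partition; tasks for different partitions may run concurrently.
        CompletableFuture<?>[] tasks = partitions.stream()
                .map(p -> CompletableFuture.runAsync(() -> f.accept(p.iterator())))
                .toArray(CompletableFuture[]::new);
        // The caller gets a future back immediately instead of blocking until all tasks finish.
        return CompletableFuture.allOf(tasks);
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2, 3), Arrays.asList(4, 5), Arrays.asList(6));

        AtomicInteger elementCalls = new AtomicInteger();
        AtomicInteger partitionCalls = new AtomicInteger();

        CompletableFuture<Void> perElement =
                foreachAsync(partitions, x -> elementCalls.incrementAndGet());
        CompletableFuture<Void> perPartition =
                foreachPartitionAsync(partitions, it -> partitionCalls.incrementAndGet());

        // join() blocks until all partition tasks have completed.
        perElement.join();
        perPartition.join();

        System.out.println(elementCalls.get());   // 6: one call per element
        System.out.println(partitionCalls.get()); // 3: one call per partition
    }
}
```

In this model "async" only describes the caller, which gets a future
instead of waiting; within a partition the values are still visited one
after another. The per-partition form would be the natural place for
setup that should happen once per partition rather than once per value.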

Thanks!
