spark-user mailing list archives

From Kevin Jung
Subject Re: Manually trigger RDD map function without action
Date Tue, 13 Jan 2015 01:13:05 GMT
Cody said, "If you don't care about the value that your map produced (because
you're not already collecting or saving it), then is foreach more
appropriate to what you're doing?" but I cannot find that message in this thread.
Anyway, I ran a small benchmark to see which function is the most
efficient way to trigger the map. The winner is foreach(a => a), as everyone
expected. collect can cause an OOM on the driver, and count is much slower
than the others. Thanks, all.
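
For anyone finding this later, here is a minimal sketch of the foreach approach. It assumes the reason for forcing the map is to populate a cached RDD, which is my guess at the use case and is not stated above. The object name, app name, local master, and sample data are all placeholders, not taken from the benchmark.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and data are illustrative only.
object ForceEvaluation {
  def main(args: Array[String]): Unit = {
    // Placeholder app name and local master, just to make the sketch runnable.
    val conf = new SparkConf().setAppName("force-evaluation").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // map is lazy: nothing is computed on this line.
      // cache() only marks the result to be kept once it has been computed.
      val mapped = sc.parallelize(1 to 1000000).map(_ * 2).cache()

      // foreach is an action, so it runs the map on the executors.
      // It returns nothing to the driver; same idea as foreach(a => a).
      mapped.foreach(_ => ())

      // A later action can read the cached partitions instead of re-running the map.
      println(mapped.take(5).mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

Unlike collect, foreach does not ship any elements back to the driver, so the driver's memory use does not grow with the size of the RDD.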
