spark-dev mailing list archives

From freedafeng <>
Subject What does do?
Date Thu, 13 Aug 2015 16:13:19 GMT
I am running a Spark job with only two operations: mapPartitions and then
collect(). The output of mapPartitions is very small: one integer per
partition. I see a stage 2 for this job that runs this Java program. I am
not a Java programmer. Could anyone please tell me what this Java program
does, or how to keep it from running, or at least how to make it run
faster? The collect() call is not important to me. All the work is done in
mapPartitions, which sends data out to a k-v store, so it is really
something like foreachPartition, but I have not been able to get
foreachPartition() to run. I am on Spark 1.1.1.
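
In case it helps, here is a minimal sketch of the shape of the job, not the real code. It assumes Python; a plain dict stands in for the k-v store and two lists stand in for the RDD partitions so that it runs without Spark. The names fake_store and write_partition are made up for illustration.

```python
# Hypothetical stand-in for the external k-v store; the real job
# writes to a remote service instead of a local dict.
fake_store = {}


def write_partition(records):
    """Write every (key, value) record of one partition to the store,
    then yield a single integer: the number of records written."""
    written = 0
    for key, value in records:
        fake_store[key] = value
        written += 1
    yield written


# Two plain lists stand in for RDD partitions (assumption: the real
# data comes from an RDD, which is not needed to show the pattern).
partitions = [[("a", 1), ("b", 2)], [("c", 3)]]

# Local equivalent of: rdd.mapPartitions(write_partition).collect()
counts = [n for part in partitions for n in write_partition(iter(part))]
print(counts)
```

With a real RDD the last step would be rdd.mapPartitions(write_partition).collect(). The collected integers are not used for anything; only the writes to the store matter.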

