spark-user mailing list archives

From Kostas Kougios <>
Subject sc.parallelise to work more like a producer/consumer?
Date Tue, 28 Jul 2015 14:58:22 GMT
Hi, I am calling sc.parallelise(...32k of items) several times within one job. Each
executor takes a different amount of time to process its items, so some executors
finish quickly and then sit idle until the others catch up. Only after all
executors have completed the first 32k batch is the next batch sent for
processing.

Is there a way to make it work more like a producer/consumer?
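One common workaround (a sketch of the idea, not an official recipe): instead of submitting each 32k batch as its own job, build a single RDD over all the items and split it into many small partitions. Spark's scheduler hands a new task (partition) to each executor as soon as it frees up, so fast executors keep pulling work instead of waiting at a per-batch barrier. A minimal sketch, assuming a SparkContext named `sc`; `batches` and `process` are hypothetical stand-ins for the real input batches and per-item work:

```scala
// Hypothetical: `batches` is the sequence of 32k-item batches the job
// currently submits one at a time; `process` is the per-item work.
val allItems: Seq[Item] = batches.flatten

// Use several partitions per core so the scheduler streams tasks to
// executors as they become free, rather than stalling everyone at a
// batch boundary while the slowest executor finishes.
val numSlices = sc.defaultParallelism * 4

val results = sc.parallelize(allItems, numSlices)
                .map(process)
                .collect()
```

The key knob is the second argument to `parallelize` (`numSlices`): with noticeably more partitions than executor cores, the idle-executor problem largely disappears because there is always another small task to pick up.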

