spark-user mailing list archives

From thomas lavocat <>
Subject Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?
Date Tue, 05 Jun 2018 11:17:29 GMT

Thanks for your answer.

On 05/06/2018 11:24, Saisai Shao wrote:
> spark.streaming.concurrentJobs is a driver-side internal
> configuration; it sets how many streaming jobs can be
> submitted concurrently in one batch. Usually this should not be
> configured by the user, unless you're familiar with Spark Streaming
> internals and know the implications of this configuration.

Where can I find documentation about those implications?

I've experimented with several values of this parameter and found that
my overall throughput increases with the value of this property.
But I'm running into scalability issues: with more than 16 receivers
spread over 8 executors, my executors no longer receive work from the
driver and fall idle.
Is there an explanation?
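
For readers following the thread, here is a minimal sketch of how this property can be set on the driver. The application name, the value 4, and the 1-second batch interval are illustrative assumptions, not the settings used in the experiments above. The property is undocumented, defaults to 1, and needs to be set before the StreamingContext is created.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ConcurrentJobsSketch {
  def main(args: Array[String]): Unit = {
    // Driver-side configuration. The master URL is expected to come
    // from spark-submit, so it is not set here.
    val conf = new SparkConf()
      .setAppName("concurrent-jobs-sketch")        // hypothetical name
      .set("spark.streaming.concurrentJobs", "4")  // illustrative value; default is 1

    // The 1-second batch interval is an assumption for this sketch.
    val ssc = new StreamingContext(conf, Seconds(1))

    // Define receivers and output operations here, then call
    // ssc.start() and ssc.awaitTermination().
  }
}
```

The same value can also be passed at submission time with `--conf spark.streaming.concurrentJobs=4` on the spark-submit command line.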

