spark-user mailing list archives
From Lysiane Bouchard <bouchard.lysi...@gmail.com>
Subject Re: The speed of Spark streaming reading data from kafka stays low
Date Mon, 13 Mar 2017 13:38:23 GMT
Hi,

If you haven't already, I would recommend verifying the following
configuration properties:
spark.streaming.kafka.maxRatePerPartition
spark.streaming.backpressure.enabled
spark.streaming.receiver.maxRate

See the documentation for your Spark Streaming version here
<https://spark.apache.org/docs/1.6.2/configuration.html#spark-streaming> for
more details.
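
For example, here is a minimal sketch of how these could be set in Scala
(the application name, rate values, and helper names are illustrative
assumptions, not recommendations; the 60s batch interval matches your setup):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical application name and rate limits; tune to your workload.
val conf = new SparkConf()
  .setAppName("KafkaDirectThroughput")
  // Upper bound, in records per second, read from each Kafka partition.
  // With 10 partitions, "1000" caps total ingestion at ~10,000 events/sec.
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")
  // Let Spark adapt the ingestion rate to the observed processing speed.
  .set("spark.streaming.backpressure.enabled", "true")
  // Applies to receiver-based streams only, so it has no effect on the
  // direct approach; listed here for completeness.
  .set("spark.streaming.receiver.maxRate", "10000")

val ssc = new StreamingContext(conf, Seconds(60))

With the direct approach, spark.streaming.kafka.maxRatePerPartition is the
relevant cap: if it is set too low, or backpressure has throttled the rate,
the consumer side will plateau no matter how fast the producer runs.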

Good luck!

On Mon, Mar 13, 2017 at 3:20 AM, churly lin <churylin@gmail.com> wrote:

> Hi all:
> I am using Spark *Streaming (1.6.2)* + *Kafka (0.10.1.0)*. To be specific,
> I read events from a Kafka topic via the *Spark Streaming direct approach*.
> Kafka: *1 topic, 10 partitions*.
> Spark Streaming: *10 executors*, matching the 10 Kafka partitions. The
> *batch window time* is set to 60s.
>
> After running, the Spark Streaming processing time is about 20s, much less
> than the batch window size. But no matter how the input rate of the Kafka
> producer changed (3000 events/sec, 4000 events/sec, 6000 events/sec), the
> input rate of Spark Streaming (the Kafka consumer) stayed at about 3000
> events/sec, which means the Spark Streaming (Kafka consumer) side couldn't
> catch up with the Kafka producer side. So, is there a way to increase the
> throughput of the *Spark Streaming + Kafka (direct approach)* system?
>
> I have tried increasing the Kafka partitions from 10 to 20 and, accordingly,
> the executors from 10 to 20, but it didn't work.
>
> Thanks.
>
>
