kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Wes Chow <...@chartbeat.com>
Subject Re: spikes in producer requests/sec
Date Wed, 12 Nov 2014 18:00:22 GMT

Thanks for suggestions everyone. Will get some more metrics into place 
and report back.

Wes

> Jun Rao <mailto:junrao@gmail.com>
> November 12, 2014 at 12:24 PM
> You can also look into the producer time metrics and see if the time 
> goes up before the spike of the request rate. If the time does go up, 
> you can look at the breakdown of the time. Any latency due to local 
> I/O will be included in the local time.
>
> Thanks,
>
> Jun
>
>
> Wes Chow <mailto:wes@chartbeat.com>
> November 11, 2014 at 1:50 PM
>
> We're seeing periodic spikes in req/sec rates across our nodes. Our 
> cluster is 10 nodes, and the topic has a replication factor of 3. We 
> push around 200k messages / sec into Kafka.
>
>
> The machines are running the most recent version of Kafka and we're 
> connecting via librdkafka. pingstream02-10 are using the CMS garbage 
> collector, but I switched pingstream01 to use G1GC under the theory 
> that maybe these were GC pauses. The graph shows that likely didn't 
> improve the situation.
>
> My next thought is that maybe this is the effect of log rolling. 
> Checking in the logs, I see a lot of this:
>
> [2014-11-11 13:46:45,836] 72952071 [ReplicaFetcherThread-0-7] INFO  
> kafka.log.Log  - Rolled new log segment for 'pings-342' in 3 ms.
> [2014-11-11 13:46:47,116] 72953351 [kafka-request-handler-0] INFO  
> kafka.log.Log  - Rolled new log segment for 'pings-186' in 2 ms.
> [2014-11-11 13:46:48,155] 72954390 [ReplicaFetcherThread-0-8] INFO  
> kafka.log.Log  - Rolled new log segment for 'pings-253' in 3 ms.
> [2014-11-11 13:46:48,408] 72954643 [ReplicaFetcherThread-0-4] INFO  
> kafka.log.Log  - Rolled new log segment for 'pings-209' in 3 ms.
> [2014-11-11 13:46:48,436] 72954671 [ReplicaFetcherThread-0-4] INFO  
> kafka.log.Log  - Rolled new log segment for 'pings-299' in 2 ms.
> [2014-11-11 13:46:48,687] 72954922 [kafka-request-handler-0] INFO  
> kafka.log.Log  - Rolled new log segment for 'pings-506' in 2 ms.
>
> The "pings" topic in question has 512 partitions, so it does this 512 
> times every so often. We have an effective retention period of a bit 
> less than 30 min, so rolling happens pretty frequently. Still, if I 
> assume worst case that rolling locks up the process for 2ms and there 
> are 512 rolls every few minutes, I'd expect halting to happen for 
> about a second at a time. The graphs seem to indicate much longer 
> dips, but it's hard for me to know if I'm looking at real data or some 
> sort of artifact.
>
> Fwiw, the producers are not reporting any errors, so it does not seem 
> like we're losing data.
>
> I'm new to Kafka. Should I be worried? If so, how should I be 
> debugging this?
>
> Thanks,
> Wes
>

Mime
View raw message