kafka-users mailing list archives

From Alex Craig <alexcrai...@gmail.com>
Subject Kafka Streams endless rebalancing
Date Thu, 09 Apr 2020 13:33:41 GMT
Hi all, I’ve got a Kafka Streams application running in a Kubernetes
environment.  The topology on this application has 2 aggregations (and
therefore 2 KTables), both of which can get fairly large: the first is
around 200GB and the second around 500GB.  As with any K8s platform, pods
can occasionally get rescheduled or go down, which of course will cause my
application to rebalance.  However, what I’m seeing is that the application
will literally spend hours rebalancing, without any errors being thrown or
any other obvious cause for the frequent rebalances.  All I can see in the
logs is that an instance will be restoring a state store from the changelog
topic, then suddenly have its partitions revoked and begin the join-group
process all over again.  (I’m running 10 pods/instances of my app, and I see
this same pattern in each instance.)  In some cases it never really recovers
from this rebalancing cycle, even after 12 hours or more, and I’ve had to
scale down the application completely and start over by purging the
application state and re-consuming from earliest on the source topic.
Interestingly, after purging and starting from scratch the application
seems to recover from rebalances pretty easily.
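
For context, the shape of the topology is roughly the following. This is a
simplified sketch only: the topic and store names are placeholders and the
aggregation logic here is trivial, not the real code.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;

public class TopologySketch {

  public static StreamsBuilder buildTopology() {
    final StreamsBuilder builder = new StreamsBuilder();

    // Source topic (placeholder name); the real key/value types and serdes differ.
    final KStream<String, String> source = builder.stream("source-topic");

    // First aggregation -> first KTable (~200GB of state), materialized in a
    // persistent RocksDB store backed by a changelog topic.
    final KTable<String, String> firstAgg = source
        .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
        .reduce((oldValue, newValue) -> newValue,
                Materialized.as("first-agg-store"));

    // Second aggregation -> second KTable (~500GB of state), re-keyed off the
    // result of the first aggregation.
    final KTable<String, String> secondAgg = firstAgg
        .toStream()
        .groupBy((key, value) -> value,
                 Grouped.with(Serdes.String(), Serdes.String()))
        .reduce((oldValue, newValue) -> newValue,
                Materialized.as("second-agg-store"));

    return builder;
  }
}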

The storage I’m using is a NAS device, which admittedly is not particularly
fast.  (It’s using spinning disks and is shared amongst other tenants.)  As
an experiment, I’ve tried switching the k8s storage to an in-memory option
(this is at the k8s layer; the application is still using the same RocksDB
stores) to see if that helps.  As it turns out, I never have the rebalance
problem when using the in-memory persistence layer.  If a pod goes down, the
application spends around 10 to 15 minutes rebalancing and then is back to
processing data again.

At this point I guess my main question is: when I’m using the NAS storage
and the state stores are fairly large, could I be hitting some timeout
somewhere that isn’t allowing the restore process to complete, which then
triggers another rebalance?  In other words, is the restore process simply
taking too long, given the amount of data to restore and the slow storage?
I’m currently using Kafka 2.4.1, but I saw this same behavior in 2.3.  I am
using a custom RocksDB config setter to limit off-heap memory, but I’ve
tried removing it and saw no difference in the rebalance problem.  Again,
there are no errors that I can see, or anything else in the logs, that would
indicate why it can never finish rebalancing.  I’ve tried turning on DEBUG
logging, but I’m having a tough time sifting through the volume of log
messages, though I’m still looking.
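
For reference, the config setter follows the usual memory-bounding pattern
from the Streams documentation, roughly like the sketch below.  The byte
values here are illustrative, not my exact settings, and it’s registered via
the rocksdb.config.setter property in the StreamsConfig.

import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Cache;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;
import org.rocksdb.WriteBufferManager;

public class BoundedMemoryRocksDBConfig implements RocksDBConfigSetter {

  // Illustrative sizes only; the real values are tuned against the pod limits.
  private static final long TOTAL_OFF_HEAP_MEMORY = 512 * 1024 * 1024L;
  private static final long TOTAL_MEMTABLE_MEMORY = 128 * 1024 * 1024L;
  private static final double INDEX_FILTER_BLOCK_RATIO = 0.1;

  // Shared across all store instances so the bound applies to the whole app.
  private static final Cache CACHE =
      new LRUCache(TOTAL_OFF_HEAP_MEMORY, -1, false, INDEX_FILTER_BLOCK_RATIO);
  private static final WriteBufferManager WRITE_BUFFER_MANAGER =
      new WriteBufferManager(TOTAL_MEMTABLE_MEMORY, CACHE);

  @Override
  public void setConfig(final String storeName, final Options options,
                        final Map<String, Object> configs) {
    final BlockBasedTableConfig tableConfig =
        (BlockBasedTableConfig) options.tableFormatConfig();

    // Count the memtables and index/filter blocks against the shared block
    // cache so total off-heap usage stays bounded.
    tableConfig.setBlockCache(CACHE);
    tableConfig.setCacheIndexAndFilterBlocks(true);
    options.setWriteBufferManager(WRITE_BUFFER_MANAGER);

    tableConfig.setCacheIndexAndFilterBlocksWithHighPriority(true);
    tableConfig.setPinTopLevelIndexAndFilter(true);

    options.setTableFormatConfig(tableConfig);
  }

  @Override
  public void close(final String storeName, final Options options) {
    // The cache and write buffer manager are shared by every store instance,
    // so they are intentionally not closed per store here.
  }
}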

If anyone has any ideas I would appreciate it, thanks!

Alex C
