kafka-users mailing list archives

From Andrew Jorgensen <ajorgen...@twitter.com.INVALID>
Subject Partition reassignment reversed
Date Tue, 02 Dec 2014 03:16:50 GMT
Unfortunately I do not have any specific logs from these events, but I will try to describe
them as accurately as possible to give an idea of the problem I saw.

The odd behavior manifested itself when I bounced all of the kafka processes on each of the
servers in a 12 node cluster. A few weeks prior I did a partition reassignment to add four
new kafka brokers to the cluster. This cluster has 4 topics, each with 350 partitions,
a retention policy of 6 hours, and a replication factor of 1. Originally I attempted
to run a migration on all of the topics and partitions adding the 4 new nodes using the partition
reassignment tool. This seemed to cause a lot of network congestion and according to the logs
some of the nodes were having trouble talking to each other. The network congestion lasted
for the duration of the migration and began to get better toward the end. After the migration
I confirmed that data was being stored and served from the new brokers. Today I bounced each
of the kafka processes on each of the brokers to pick up a change made to the log4j properties.
After bouncing one process I started seeing some strange errors on the four newer broker
nodes that looked like:

kafka.common.NotAssignedReplicaException: Leader 10 failed to record follower 7's position
0 for partition [topic-1,185] since the replica 7 is not recognized to be one of the assigned
replicas 10 for partition [topic-2,185]

and on the older kafka brokers the errors looked like:

[2014-12-01 17:06:04,268] ERROR [ReplicaFetcherThread-0-12], Error for partition [topic-1,175]
to broker 12:class kafka.common.UnknownException (kafka.server.ReplicaFetcherThread)

I proceeded to bounce the rest of the kafka processes, after which the errors
seemed to stop. It wasn’t until a few hours later that I noticed the amount of data stored
on the 4 new kafka brokers had dropped off significantly. When I ran a describe for the topics
in the errors it was clear that the assigned partitions had been reverted to a state prior
to the original migration to add the 4 new brokers. I am unsure of why bouncing the kafka
process would cause the state in zookeeper to get overwritten, given that it had seemed to
be working for the last few weeks until the process was restarted. My hunch is that
the controller keeps some state about the world pre-reassignment and removes that state after
it detects that the reassignment happened successfully. In this case I suspect the network congestion
on each of the brokers kept the controller from being notified when all the reassignments
completed, so it kept the pre-reassignment state around. When the process was bounced
it read this state from zookeeper and reverted the existing assignment to the pre-reassignment
state. Has this behavior been observed before? Does this sound like a logical understanding
of what happened in this case?
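
One way to test this hunch would be to look for reassignment state still sitting in
zookeeper after the migration appeared to finish. The sketch below is only an illustration,
not something that was run on this cluster. It assumes the pending reassignment is stored as
JSON in /admin/reassign_partitions and the current assignment as JSON in /brokers/topics/<topic>,
with the layouts shown in the comments; those paths and layouts should be checked against the
Kafka version in use. The function name leftover_reassignments is hypothetical, and the caller
is expected to fetch the two JSON strings from zookeeper separately.

```python
import json


def leftover_reassignments(reassign_json, topic_json, topic):
    """Return partitions of `topic` whose pending target differs from the
    replicas currently assigned.

    Assumed layouts (verify against your Kafka version):
      reassign_json: {"version":1,"partitions":[
                        {"topic":"t","partition":0,"replicas":[1]}]}
      topic_json:    {"version":1,"partitions":{"0":[1]}}

    reassign_json may be None when the reassignment znode does not exist.
    """
    if not reassign_json:
        # No pending reassignment recorded, so nothing is left over.
        return {}

    pending = json.loads(reassign_json)["partitions"]
    current = json.loads(topic_json)["partitions"]

    leftover = {}
    for entry in pending:
        if entry["topic"] != topic:
            continue
        # Partition ids are JSON object keys (strings) in the topic state.
        assigned = current.get(str(entry["partition"]))
        if assigned != entry["replicas"]:
            # Map partition id -> (currently assigned, pending target).
            leftover[entry["partition"]] = (assigned, entry["replicas"])
    return leftover
```

A non-empty result long after a migration would suggest the reassignment was never
marked complete, which is the situation described above.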

Andrew Jorgensen