kafka-users mailing list archives

From Taylor Gautier <tgaut...@tagged.com>
Subject Re: Exception causing Kafka to crash
Date Fri, 07 Oct 2011 17:47:58 GMT
Oh, I think I found it:

zk.sessiontimeout.ms
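
For anyone else reading the "conflict in /brokers/ids/0" line quoted further down in this thread, here is a minimal Python sketch that pulls the two registrations apart so they can be compared. It assumes each value has the layout <host>-<epoch millis>:<host>:<port>, which is inferred only from that one log line; reading the number as a registration time in milliseconds is also an assumption. The names parse_entry and describe_conflict are made up for illustration and are not part of Kafka.

```python
import re
from datetime import datetime, timezone

# Layout assumed from the single log line quoted in this thread:
#   <host>-<epoch millis>:<host>:<port>
ENTRY = re.compile(
    r"(?P<creator_host>[^\s:]+)-(?P<millis>\d+):(?P<host>[^\s:]+):(?P<port>\d+)"
)


def parse_entry(text):
    """Split one registration string into its assumed fields."""
    match = ENTRY.fullmatch(text.strip())
    if match is None:
        raise ValueError("unexpected registration format: %r" % text)
    millis = int(match.group("millis"))
    return {
        "creator_host": match.group("creator_host"),
        "millis": millis,
        # Assumption: the number is a Unix timestamp in milliseconds.
        "registered_utc": datetime.fromtimestamp(millis / 1000, tz=timezone.utc),
        "host": match.group("host"),
        "port": int(match.group("port")),
    }


def describe_conflict(new_data, stored_data):
    """Compare the value being written with the one already stored in zk."""
    new, stored = parse_entry(new_data), parse_entry(stored_data)
    return {
        # Same host:port on both sides is consistent with one broker being
        # restarted; a different endpoint hints at two brokers sharing an id.
        "same_endpoint": (new["host"], new["port"]) == (stored["host"], stored["port"]),
        # How much later the new registration is than the stored one.
        "gap_ms": new["millis"] - stored["millis"],
    }


if __name__ == "__main__":
    print(describe_conflict(
        "10.98.20.109-1317681202194:10.98.20.109:9092",
        "10.98.20.109-1317268078266:10.98.20.109:9092",
    ))
```

The gap only says how far apart the two registrations were made. It does not say how long ago the old process died, so it cannot by itself show whether the restart fell inside the zk session timeout.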

On Fri, Oct 7, 2011 at 10:44 AM, Taylor Gautier <tgautier@tagged.com> wrote:

> Thanks Jay,
>
> I looked around, but it wasn't immediately obvious to me which setting to
> change to reduce the zk ephemeral timeout. Does kafka configure zk itself?
> If so, am I looking for a kafka setting? I didn't see anything
> appropriate…
>
>
>
> On Fri, Oct 7, 2011 at 9:21 AM, Jay Kreps <jay.kreps@gmail.com> wrote:
>
>> It occurs to me that we could do a better job with this error. There are
>> really three things that might have happened: (1) you restarted kafka
>> within the zk timeout, in which case as far as zk is concerned your old
>> broker still exists...this is weird but actually correct behavior, (2) you
>> have two brokers with the same id, (3) zk has a bug and is not deleting
>> ephemeral nodes.
>>
>> I think if we just improved the error message to explain this we would
>> have happier users. As it is, understanding why this happens requires
>> fairly deep knowledge of zk.
>>
>> -Jay
>>
>> On Fri, Oct 7, 2011 at 7:35 AM, Mathias Herberts <mathias.herberts@gmail.com> wrote:
>>
>> > If you abort Kafka (killing the JVM for example) and restart it,
>> > depending on the zookeeper timeout you've used, it might occur that
>> > the ephemeral node created by the broker has not yet been removed by
>> > ZK.
>> >
>> > If this is the case, Kafka will detect that there is a znode conflict
>> > and kill itself.
>> >
>> > This is what your logs seem to imply:
>> >
>> > [2011-10-03 15:33:22,229] INFO conflict in /brokers/ids/0 data:
>> > 10.98.20.109-1317681202194:10.98.20.109:9092 stored data:
>> > 10.98.20.109-1317268078266:10.98.20.109:9092 (kafka.utils.ZkUtils$)
>> >
>> > Try to either wait for more than the ZK timeout prior to restarting
>> > Kafka, or lower the ZK timeout so the ephemeral node is indeed gone
>> > when you restart Kafka.
>> >
>> > Mathias.
>> >
>>
>
>
