kafka-users mailing list archives

From Prakash Gowri Shankor <prakash.shan...@gmail.com>
Subject Re: LeaderNotAvailableException in 0.8.1.1
Date Mon, 16 Jun 2014 01:22:25 GMT
yes, I gave it several minutes.


On Sat, Jun 14, 2014 at 2:18 PM, Michael G. Noll <michael@michael-noll.com>
wrote:

> Have you given Kafka some time to re-elect a new leader for the
> "missing" partition when you re-try steps 1-5?
>
> See here:
> > If you do, you should be able to go through steps
> > 1-8 without seeing LeaderNotAvailableExceptions (you may need to give
> > Kafka some time to re-elect the remaining, second broker as the new
> > leader for the first broker's partitions though).
>
> Best,
> Michael
>
>
>
> On 06/12/2014 08:43 PM, Prakash Gowri Shankor wrote:
> > So if we go back to the 2-broker case, I tried your suggestion with
> > replication-factor 2:
> >
> > ./kafka-topics.sh --topic test2 --create --partitions 3 --zookeeper
> > localhost:2181 --replication-factor 2
> >
> > When I repeat steps 1-5 I still see the exception. When I go to step 8
> > (back to 2 brokers), I don't see it.
> > Here is my topic description:
> >
> > ./kafka-topics.sh --describe --topic test2 --zookeeper localhost:2181
> >
> > Topic:test2 PartitionCount:3 ReplicationFactor:2 Configs:
> >
> > Topic: test2 Partition: 0 Leader: 1 Replicas: 1,0 Isr: 1,0
> >
> > Topic: test2 Partition: 1 Leader: 0 Replicas: 0,1 Isr: 0,1
> >
> > Topic: test2 Partition: 2 Leader: 1 Replicas: 1,0 Isr: 1,0
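> >
> > (If it helps, while one broker is down I can re-run the same describe and
> > check the Leader and Isr columns for that broker's partitions; if this
> > build of kafka-topics.sh supports the --unavailable-partitions switch, it
> > should narrow the output to partitions that currently have no leader:)
> >
> > ./kafka-topics.sh --describe --topic test2 --zookeeper localhost:2181
> > ./kafka-topics.sh --describe --topic test2 --zookeeper localhost:2181 --unavailable-partitions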
> >
> >
> > On Wed, Jun 11, 2014 at 3:20 PM, Michael G. Noll <
> > michael+storm@michael-noll.com> wrote:
> >
> >> In your second case (1-broker cluster and putting your laptop to sleep)
> >> these exceptions should be transient and disappear after a while.
> >>
> >> In the logs you should see ZK session expirations (hence the
> >> initial/transient exceptions, which in this case are expected and ok),
> >> followed by new ZK sessions being established.
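> >>
> >> A quick (if crude) way to confirm this is to grep the broker log for
> >> session-related messages right after the machine wakes up.  This assumes
> >> the stock log4j setup, which writes the main broker log to
> >> logs/server.log under the Kafka install directory; adjust the path if
> >> your setup differs:
> >>
> >> grep -i "session" logs/server.log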
> >>
> >> So this case is (or at least should be) very different from your case number 1.
> >>
> >> --Michael
> >>
> >>
> >>> On 11.06.2014, at 23:13, Prakash Gowri Shankor <prakash.shankor@gmail.com> wrote:
> >>>
> >>> Thanks for your response Michael.
> >>>
> >>> In step 3, I am actually stopping the entire cluster and restarting it
> >>> without the 2nd broker. But I see your point. When I look in
> >>> /tmp/kafka-logs-2 (which is the log dir for the 2nd broker) I see that it
> >>> holds test2-1 (i.e. the 1st partition of the test2 topic).
> >>> /tmp/kafka-logs (which is the log dir for the first broker) holds
> >>> test2-0 and test2-2 (the 0th and 2nd partitions of the test2 topic).
> >>> So it would seem that Kafka is missing the leader for partition 1 and
> >>> hence throwing the exception on the producer side.
> >>> Let me try your replication suggestion.
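> >>>
> >>> (For reference, this is how I checked which log dir holds which
> >>> partition; the paths are the log.dirs values from my two broker configs:)
> >>>
> >>> ls /tmp/kafka-logs      # first broker: test2-0  test2-2
> >>> ls /tmp/kafka-logs-2    # second broker: test2-1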
> >>>
> >>> While all of the above might explain the exception in the case of 2
> >>> brokers, there are still times when I see it with just a single broker.
> >>> In this case, I start from a normal working cluster with 1 broker only.
> >>> Then I put my machine into sleep or hibernation. On wake, I shut down
> >>> the cluster (for sanity) and restart.
> >>> On restart, I start seeing this exception. In this case I only have one
> >>> broker. I still create the topic the way I described earlier.
> >>> I understand this is not the ideal production topology, but it's
> >>> annoying to see it during development.
> >>>
> >>> Thanks
> >>>
> >>>
> >>> On Wed, Jun 11, 2014 at 1:40 PM, Michael G. Noll <michael@michael-noll.com> wrote:
> >>>
> >>>> Prakash,
> >>>>
> >>>> you configured the topic with a replication factor of only 1, i.e. no
> >>>> additional replica beyond "the original one".  This replication setting
> >>>> of 1 means that only one of the two brokers will ever host the (single)
> >>>> replica -- which is implicitly also the leader and the only in-sync
> >>>> replica -- of a given partition.
> >>>>
> >>>> In step 3 you are disabling one of the two brokers.  Because this
> >>>> stopped broker is the only broker that hosts one or more of the 3
> >>>> partitions you configured (I can't tell which partition(s) it is, but
> >>>> you can find out by --describe'ing the topic), your Kafka cluster --
> >>>> which is now running in a degraded state -- will miss the leader of
> >>>> those affected partitions.  And because you set the replication factor
> >>>> to 1, the remaining, second broker will not and can never take over the
> >>>> leadership of those partitions from the stopped broker.  Hence you will
> >>>> keep getting LeaderNotAvailableExceptions until you restart the
> >>>> stopped broker in step 7.
> >>>>
> >>>> So to me it looks as if the behavior of Kafka is actually correct and
> >>>> as expected.
> >>>>
> >>>> If you want to "rectify" your test setup, try increasing the
> >>>> replication factor from 1 to 2.  If you do, you should be able to go
> >>>> through steps 1-8 without seeing LeaderNotAvailableExceptions (you may
> >>>> need to give Kafka some time to re-elect the remaining, second broker
> >>>> as the new leader for the first broker's partitions though).
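> >>>>
> >>>> Concretely, that would be the same create command you used, just with a
> >>>> replication factor of 2 (shown here for a freshly created topic):
> >>>>
> >>>> ./kafka-topics.sh --create --topic test2 --partitions 3 --zookeeper localhost:2181 --replication-factor 2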
> >>>>
> >>>> Hope this helps,
> >>>> Michael
> >>>>
> >>>>
> >>>>
> >>>>> On 06/11/2014 07:49 PM, Prakash Gowri Shankor wrote:
> >>>>> yes,
> >>>>> here are the steps:
> >>>>>
> >>>>> Create topic as: ./kafka-topics.sh --topic test2 --create --partitions 3
> >>>>> --zookeeper localhost:2181 --replication-factor 1
> >>>>>
> >>>>> 1) Start cluster with 2 brokers, 3 consumers.
> >>>>> 2) Don't start any producer.
> >>>>> 3) Shut down the cluster and disable one broker from starting.
> >>>>> 4) Restart the cluster with 1 broker, 3 consumers.
> >>>>> 5) Start the producer and send messages. I see this exception.
> >>>>> 6) Shut down the cluster.
> >>>>> 7) Enable the 2nd broker.
> >>>>> 8) Restart the cluster with 2 brokers, 3 consumers and the one
> >>>>> producer and send messages. Now I don't see the exception.
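> >>>>>
> >>>>> (Roughly, these are the commands behind steps 1 and 5, assuming the
> >>>>> standard quickstart scripts and using the console consumer/producer in
> >>>>> place of my actual clients; step 4 is the same as step 1 minus the
> >>>>> second broker.  server-2.properties is just a copy of server.properties
> >>>>> with its own broker.id, port and log.dir:)
> >>>>>
> >>>>> bin/zookeeper-server-start.sh config/zookeeper.properties
> >>>>> bin/kafka-server-start.sh config/server.properties
> >>>>> bin/kafka-server-start.sh config/server-2.properties   # second broker config (filename assumed)
> >>>>> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test2
> >>>>> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test2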
> >>>>
> >>>>
> >>
> >
>
>
