kafka-users mailing list archives

From Mohammed Manna <manme...@gmail.com>
Subject Re: Cluster in weird state: no leaders no ISR for all topics, but it works!
Date Mon, 05 Jun 2017 08:38:10 GMT
Hi,

I set up a fresh cluster (3 brokers, 3 zookeepers) and created a topic
according to your settings. Obviously the log directories are kept
separate (e.g. var/lib/zookeeper2 and var/lib/zookeeper3), and every
zookeeper has its own myid file to identify itself in the ensemble.
I cannot see any issues.

Do you mind trying it out with fresh quorum information? Please do the
following:

1) Delete the zookeeper logs from the log and dataLogDir locations.
2) In the zookeeper.properties for Kafka, ensure that the zookeeper config
specifies a different log directory for every zookeeper.
3) Restart the cluster in the following order:
    a) Start your zookeepers, allowing ~5s waiting time between each.
    b) Start your kafka brokers, allowing ~5s waiting time between each.
4) Create a topic using the console utility, do a --describe on the topic,
and let us know the result.
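
A quick way to check the --describe output from step 4 is to scan it for
partitions without a leader or with an empty ISR. The following is a minimal
sketch, assuming the output looks like the kafka-topics listing quoted
further down in this thread. The regular expression and the function name
unhealthy_partitions are illustrative assumptions, not part of Kafka.

```python
import re

# Assumed layout of one partition line, based on the listing quoted in this thread:
#   Topic: mytopic  Partition: 0  Leader: -1  Replicas: 30,10  Isr:
PARTITION_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>\S*)\s+"
    r"Isr:\s*(?P<isr>\S*)"
)


def unhealthy_partitions(describe_output):
    """Return (topic, partition, leader, isr) for partitions that report
    no leader ("-1" or "none") or an empty ISR list."""
    bad = []
    for line in describe_output.splitlines():
        m = PARTITION_RE.search(line)
        if m is None:
            continue  # topic summary line or unrelated text
        leader = m.group("leader")
        isr = [b for b in m.group("isr").split(",") if b]
        if leader in ("-1", "none") or not isr:
            bad.append((m.group("topic"), int(m.group("partition")), leader, isr))
    return bad


# Sample copied from the listing reported in this thread.
SAMPLE = """\
Topic:mytopic   PartitionCount:3    ReplicationFactor:2    Configs: retention.ms=86400000
    Topic: mytopic    Partition: 0    Leader: -1    Replicas: 30,10    Isr:
    Topic: mytopic    Partition: 1    Leader: -1    Replicas: 10,20    Isr:
    Topic: mytopic    Partition: 2    Leader: -1    Replicas: 20,30    Isr:
"""

if __name__ == "__main__":
    for topic, partition, leader, isr in unhealthy_partitions(SAMPLE):
        print(topic, partition, "leader:", leader, "isr:", isr)
```

Feed it the captured output of kafka-topics --describe; an empty list means
every listed partition reports a leader and a non-empty ISR.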

KR,



On 2 June 2017 at 13:55, Del Barrio, Alberto <
alberto.delbarrio@360dialog.com> wrote:

> So, I fixed the problem by doing a rolling restart, and after some checks
> it seems there was no data loss.
>
> On 1 June 2017 at 17:57, Del Barrio, Alberto <
> alberto.delbarrio@360dialog.com> wrote:
>
> > I might give it a try tomorrow. The reason for having such large init and
> > sync limit values is that in the past our ZK cluster was storing a large
> > amount of data, and lower values were not enough for the server syncs
> > when restarting zk processes.
> >
> > On 1 June 2017 at 17:52, Mohammed Manna <manmedia@gmail.com> wrote:
> >
> >> Cool, I will try to take a look into this. Meanwhile, would you mind
> >> changing the following to see if things improve?
> >>
> >> tickTime = 1000
> >> initLimit=3
> >> syncLimit=5
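
For context on these values: ZooKeeper expresses initLimit and syncLimit in
ticks, so the effective timeout is the limit multiplied by tickTime. The
sketch below works through that arithmetic for the settings suggested here
and for the ones reported elsewhere in this thread (tickTime=1000,
initLimit=2000, syncLimit=1000). The function name zk_timeouts_seconds is
made up for illustration and is not part of any ZooKeeper tooling.

```python
def zk_timeouts_seconds(tick_time_ms, init_limit, sync_limit):
    """Convert ZooKeeper limits (given in ticks) into seconds.

    tick_time_ms: tickTime, the length of one tick in milliseconds
    init_limit:   initLimit, ticks a follower may take to connect and sync
    sync_limit:   syncLimit, ticks a follower may lag behind the leader
    """
    tick_s = tick_time_ms / 1000.0
    return {
        "init_timeout_s": init_limit * tick_s,
        "sync_timeout_s": sync_limit * tick_s,
    }


if __name__ == "__main__":
    reported = zk_timeouts_seconds(1000, 2000, 1000)
    suggested = zk_timeouts_seconds(1000, 3, 5)
    # Reported settings: 2000 s (about 33 minutes) and 1000 s (about 17 minutes).
    print("reported: ", reported)
    # Suggested settings: 3 s and 5 s.
    print("suggested:", suggested)
```

With the reported settings a follower is given a very long time before it is
considered out of sync, which is why much smaller limits are worth trying.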
> >>
> >> On 1 June 2017 at 16:49, Del Barrio, Alberto <
> >> alberto.delbarrio@360dialog.com> wrote:
> >>
> >> > Here are the configs you were asking for:
> >> >
> >> > Zookeeper:
> >> > tickTime=1000
> >> > initLimit=2000
> >> > syncLimit=1000
> >> > dataDir=/var/lib/zookeeper
> >> > clientPort=2181
> >> > server.3=10.0.0.3:2888:3888
> >> > server.2=10.0.0.2:2888:3888
> >> > server.1=10.0.0.1:2888:3888
> >> >
> >> >
> >> > Kafka broker (for one of them):
> >> > broker.id=10
> >> > listeners=PLAINTEXT://10.0.0.4:9092
> >> > num.network.threads=3
> >> > num.io.threads=8
> >> > socket.send.buffer.bytes=102400
> >> > socket.receive.buffer.bytes=102400
> >> > socket.request.max.bytes=104857600
> >> > log.dirs=/var/lib/kafka
> >> > num.partitions=2
> >> > num.recovery.threads.per.data.dir=1
> >> > zookeeper.connect=10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181/kafka
> >> > zookeeper.connection.timeout.ms=6000
> >> >
> >> > In general they're pretty much the default ones.
> >> > I can see in Zookeeper the kafka brokers connected to it and
> >> > exchanging data...
> >> >
> >> > Thanks for your help and time.
> >> >
> >> > On 1 June 2017 at 17:32, Mohammed Manna <manmedia@gmail.com> wrote:
> >> >
> >> > > Could you please share your broker/zookeeper/topic configs ?
> >> > >
> >> > > On 1 June 2017 at 16:18, Del Barrio, Alberto <
> >> > > alberto.delbarrio@360dialog.com> wrote:
> >> > >
> >> > > > I tried creating the topic and the results are very similar to the
> >> > > > current situation: there is no ISR and no leader for any of the
> >> > > > partitions, but now kafka-topics shows *Leader: none*, whereas for
> >> > > > all the other topics it shows *Leader: -1*.
> >> > > >
> >> > > >
> >> > > > On 1 June 2017 at 17:05, Mohammed Manna <manmedia@gmail.com> wrote:
> >> > > >
> >> > > > > I had a similar situation, but only 1 of my ZKs was struggling.
> >> > > > > Since the ISR syncing time is configurable, I was confident enough
> >> > > > > to bounce 1 ZK at a time, and it worked out.
> >> > > > > Does it happen even when you create a new topic with a
> >> > > > > replication:partition ratio of 1?
> >> > > > >
> >> > > > > I meant 3 replicas, 3 partitions :)
> >> > > > >
> >> > > > > On 1 June 2017 at 15:58, Del Barrio, Alberto <
> >> > > > > alberto.delbarrio@360dialog.com> wrote:
> >> > > > >
> >> > > > > > Hi Mohammed,
> >> > > > > >
> >> > > > > > thanks for your answer.
> >> > > > > > The ZK cluster is not located on the servers where Kafka runs
> >> > > > > > but on 3 other machines. This ZK cluster is used by several
> >> > > > > > other services which are not reporting problems.
> >> > > > > > Regarding your question, I haven't tried restarting the
> >> > > > > > kafka-server processes because there's no leader for the topic
> >> > > > > > partitions, so I don't know what will happen. I have never been
> >> > > > > > in a similar situation in several years of using Kafka.
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > On 1 June 2017 at 16:29, Mohammed Manna <manmedia@gmail.com> wrote:
> >> > > > > >
> >> > > > > > > Hi Alberto,
> >> > > > > > >
> >> > > > > > > Usually this means that the leader election/replica syncing
> >> > > > > > > could not complete successfully, and the zookeeper logs should
> >> > > > > > > show this too. The Leader: -1 is what worries me. For your
> >> > > > > > > case (3-broker cluster), I am assuming you have configured the
> >> > > > > > > cluster with one zookeeper per broker?
> >> > > > > > > If that's the case, you should be able to bounce 1 zookeeper
> >> > > > > > > at a time and see if that resolves the issue.
> >> > > > > > >
> >> > > > > > > That said, have you restarted your servers since this issue
> >> > > > > > > surfaced?
> >> > > > > > >
> >> > > > > > > On 1 June 2017 at 14:11, Del Barrio, Alberto <
> >> > > > > > > alberto.delbarrio@360dialog.com> wrote:
> >> > > > > > >
> >> > > > > > > > Hi all,
> >> > > > > > > >
> >> > > > > > > > I'm experiencing an issue which I don't know how to solve,
> >> > > > > > > > so I'm trying to find some guidance on the topic.
> >> > > > > > > >
> >> > > > > > > > I have a cluster composed of 3 servers, one broker per server,
> >> > > > > > > > running Kafka 0.10.0.1-1. It runs in production with around
> >> > > > > > > > 100 topics, most of them divided into several partitions and
> >> > > > > > > > always replicated between 2 servers.
> >> > > > > > > > Suddenly I've noticed, when looking at my topics (with the
> >> > > > > > > > kafka-topics tool), that none of them has a leader
> >> > > > > > > > (Leader: -1) and the ISR list appears empty for all the topics.
> >> > > > > > > > So they look something like:
> >> > > > > > > >
> >> > > > > > > > Topic:mytopic   PartitionCount:3    ReplicationFactor:2    Configs: retention.ms=86400000
> >> > > > > > > >     Topic: mytopic    Partition: 0    Leader: -1    Replicas: 30,10    Isr:
> >> > > > > > > >     Topic: mytopic    Partition: 1    Leader: -1    Replicas: 10,20    Isr:
> >> > > > > > > >     Topic: mytopic    Partition: 2    Leader: -1    Replicas: 20,30    Isr:
> >> > > > > > > >
> >> > > > > > > > However, the applications using it, consumers as well as
> >> > > > > > > > producers, are running normally.
> >> > > > > > > > The logs are not showing errors or weird messages, with the
> >> > > > > > > > exception of some
> >> > > > > > > > Failed to rename [/var/log/kafka/log-cleaner.log] to
> >> > > > > > > > [/var/log/kafka/log-cleaner.log.2017-05-31-17]
> >> > > > > > > > which appear every few days.
> >> > > > > > > >
> >> > > > > > > > Now I would like to bring the cluster back to a good state.
> >> > > > > > > > I'm afraid of restarting brokers because all of them are
> >> > > > > > > > supposed to be leaders for some partitions, so if I restart
> >> > > > > > > > them and there's no leader I might experience data loss.
> >> > > > > > > >
> >> > > > > > > > Have you faced any similar situation? Can someone give me a
> >> > > > > > > > hint?
> >> > > > > > > >
> >> > > > > > > > Thanks in advance,
> >> > > > > > > > Alberto.
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > --
> >> > > > > > News, jobs, product releases, events.
> >> > > > > > Follow 360dialog on LinkedIn <https://www.linkedin.com/company/360dialog>
> >> > > > > >  and Twitter <https://twitter.com/360dialog>.
> >> > > > > > Subscribe to our newsletter <http://www.360dialog.com/newsletter/>.
> >> > > > > >
> >> > > > > >
> >> > > > > > *Alberto del Barrio*DevOps Engineer
> >> > > > > >
> >> > > > > > <http://www.360dialog.com?utm_campaign=email-signature&utm_content=-&utm_medium=email&utm_source=signature&utm_term=logo>
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > *Contact 360dialog*www.360dialog.com
> >> > > > > > <http://www.360dialog.com/?utm_campaign=email-signature&utm_content=-&utm_medium=email&utm_source=signature&utm_term=url>
> >> > > > > > info@360dialog.com
> >> > > > > > +49-(0)30-6098-5953-0
> >> > > > > >
> >> > > > > > 360dialog GmbH, Saarbrücker Str. 36-38, 10405 Berlin, Germany
> >> > > > > > Managing director: Roland Siebert
> >> > > > > > Commercial register: Charlottenbug, HRB 144188 B
> >> > > > > > VAT ID: DE815382679
> >> > > > > >
> >> > > > >
> >> > >
> >>
>
