kafka-users mailing list archives

From "Del Barrio, Alberto" <alberto.delbar...@360dialog.com>
Subject Re: Cluster in weird state: no leaders no ISR for all topics, but it works!
Date Wed, 07 Jun 2017 13:35:40 GMT
Hi,

I followed the instructions you detailed and could create topics, which
got a leader and were properly replicated.
I think the problem I experienced was caused by some earlier temporary
communication problems between Kafka and ZooKeeper, but that's only a guess.

Thanks a lot Mohammed for your time.

Alberto del Barrio.

On 5 June 2017 at 10:38, Mohammed Manna <manmedia@gmail.com> wrote:

> Hi,
>
> I set up a fresh cluster (3 brokers, 3 ZooKeepers) and created a topic
> according to your settings. Obviously the log directories are kept
> separate (e.g. var/lib/zookeeper2 and var/lib/zookeeper3), and each
> ZooKeeper has its own myid file to identify itself in the ensemble.
> I cannot see any issues.
>
> Do you mind trying it out with fresh quorum information? Please do the
> following:
>
> 1) Delete the ZooKeeper logs from the log and dataLogDir locations.
> 2) In the zookeeper.properties for Kafka, ensure that the ZooKeeper config
> specifies different log directories for every ZooKeeper.
> 3) Restart the cluster in the following pattern:
>     a) Start your ZooKeepers, allowing ~5s waiting time between each.
>     b) Start your Kafka brokers, allowing ~5s waiting time between each.
> 4) Create a topic using the console utility and do a --describe on the topic.
>
> Let us know.
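The restart pattern in step 3 can be sketched roughly as below. This is only an illustration: the node names are made up, and `start` is a hypothetical callable that you would replace with whatever actually launches a node in your environment (SSH, a service manager, etc.).

```python
import time


def staggered_start(zookeepers, brokers, start, wait_s=5.0, sleep=time.sleep):
    """Start every ZooKeeper node, then every broker, pausing between each.

    `start` is a caller-supplied (hypothetical) callable that launches one node.
    """
    order = list(zookeepers) + list(brokers)
    for i, node in enumerate(order):
        start(node)
        # Wait between nodes, but not after the last one.
        if i < len(order) - 1:
            sleep(wait_s)
    return order


# Dry run: record the start order instead of launching anything,
# and skip the real sleeping.
started = []
staggered_start(
    ["zk1", "zk2", "zk3"],
    ["kafka1", "kafka2", "kafka3"],
    start=started.append,
    sleep=lambda seconds: None,
)
print(started)
```

Passing `sleep` as a parameter is just there so the order can be checked without actually waiting.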
>
> KR,
>
>
>
> On 2 June 2017 at 13:55, Del Barrio, Alberto <
> alberto.delbarrio@360dialog.com> wrote:
>
> > So, I fixed the problem by doing a rolling restart, and after some checks
> > it seems there was no data loss.
> >
> > On 1 June 2017 at 17:57, Del Barrio, Alberto <
> > alberto.delbarrio@360dialog.com> wrote:
> >
> > > I might give it a try tomorrow. The reason for having such large init
> > > and sync limits is that in the past our ZK cluster stored a large
> > > amount of data, and lower values were not enough for the servers to
> > > sync when restarting the ZK processes.
> > >
> > > On 1 June 2017 at 17:52, Mohammed Manna <manmedia@gmail.com> wrote:
> > >
> > >> Cool - I will try to take a look into this. Meanwhile, would you mind
> > >> changing the following to see if things improve?
> > >>
> > >> tickTime = 1000
> > >> initLimit=3
> > >> syncLimit=5
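For context, ZooKeeper counts initLimit and syncLimit in ticks, so the effective timeout is tickTime multiplied by the limit. A small sketch of that arithmetic for the two sets of values mentioned in this thread (the function name is only for illustration):

```python
def zk_timeouts_seconds(tick_time_ms, init_limit, sync_limit):
    """Return (init, sync) timeouts in seconds.

    ZooKeeper expresses initLimit and syncLimit as a number of ticks,
    where one tick lasts tickTime milliseconds.
    """
    init_s = tick_time_ms * init_limit / 1000
    sync_s = tick_time_ms * sync_limit / 1000
    return init_s, sync_s


# Values reported for the existing ensemble in this thread.
current = zk_timeouts_seconds(1000, 2000, 1000)
# Values suggested here as an experiment.
suggested = zk_timeouts_seconds(1000, 3, 5)

print(current)    # (2000.0, 1000.0)
print(suggested)  # (3.0, 5.0)
```

So the existing settings allow followers roughly 33 minutes to connect and sync, versus a few seconds with the suggested ones.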
> > >>
> > >> On 1 June 2017 at 16:49, Del Barrio, Alberto <
> > >> alberto.delbarrio@360dialog.com> wrote:
> > >>
> > >> > Here are the configs you were asking for:
> > >> >
> > >> > Zookeeper:
> > >> > tickTime=1000
> > >> > initLimit=2000
> > >> > syncLimit=1000
> > >> > dataDir=/var/lib/zookeeper
> > >> > clientPort=2181
> > >> > server.3=10.0.0.3:2888:3888
> > >> > server.2=10.0.0.2:2888:3888
> > >> > server.1=10.0.0.1:2888:3888
> > >> >
> > >> >
> > >> > Kafka broker (for one of them):
> > >> > broker.id=10
> > >> > listeners=PLAINTEXT://10.0.0.4:9092
> > >> > num.network.threads=3
> > >> > num.io.threads=8
> > >> > socket.send.buffer.bytes=102400
> > >> > socket.receive.buffer.bytes=102400
> > >> > socket.request.max.bytes=104857600
> > >> > log.dirs=/var/lib/kafka
> > >> > num.partitions=2
> > >> > num.recovery.threads.per.data.dir=1
> > >> > zookeeper.connect=10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181/kafka
> > >> > zookeeper.connection.timeout.ms=6000
> > >> >
> > >> > In general they're pretty much the defaults.
> > >> > I can see in ZooKeeper that the Kafka brokers are connected to it
> > >> > and exchanging data...
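Note that the zookeeper.connect value above ends in /kafka, which is a chroot: the brokers keep their znodes under that path rather than at the ZooKeeper root. A minimal sketch of splitting such a string (the helper name is hypothetical, not a Kafka API):

```python
def parse_zookeeper_connect(value):
    """Split a zookeeper.connect string into (hosts, chroot)."""
    # The chroot, if present, starts at the first "/" after the host list.
    hosts_part, slash, path = value.partition("/")
    hosts = [h.strip() for h in hosts_part.split(",") if h.strip()]
    chroot = "/" + path if slash else "/"
    return hosts, chroot


hosts, chroot = parse_zookeeper_connect(
    "10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181/kafka"
)
print(hosts)   # ['10.0.0.1:2181', '10.0.0.2:2181', '10.0.0.3:2181']
print(chroot)  # /kafka
```

With this configuration, broker registrations would be expected under /kafka/brokers/ids when browsing ZooKeeper directly.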
> > >> >
> > >> > Thanks for your help and time.
> > >> >
> > >> > On 1 June 2017 at 17:32, Mohammed Manna <manmedia@gmail.com> wrote:
> > >> >
> > >> > > Could you please share your broker/zookeeper/topic configs ?
> > >> > >
> > >> > > On 1 June 2017 at 16:18, Del Barrio, Alberto <
> > >> > > alberto.delbarrio@360dialog.com> wrote:
> > >> > >
> > >> > > > I tried creating the topic and the results are very similar to
> > >> > > > the current situation: there is no ISR and no leader for any of
> > >> > > > the partitions, but now kafka-topics shows *Leader: none*, whereas
> > >> > > > for all the other topics it shows *Leader: -1*.
> > >> > > >
> > >> > > >
> > >> > > > On 1 June 2017 at 17:05, Mohammed Manna <manmedia@gmail.com> wrote:
> > >> > > >
> > >> > > > > I had a similar situation, but only 1 of my ZKs was struggling.
> > >> > > > > Since the ISR syncing time is configurable, I was confident
> > >> > > > > enough to bounce 1 ZK at a time, and it worked out.
> > >> > > > > Does it happen even when you create a new topic with a
> > >> > > > > replication:partition ratio of 1?
> > >> > > > >
> > >> > > > > I meant 3 replicas, 3 partitions :)
> > >> > > > >
> > >> > > > > On 1 June 2017 at 15:58, Del Barrio, Alberto <
> > >> > > > > alberto.delbarrio@360dialog.com> wrote:
> > >> > > > >
> > >> > > > > > Hi Mohammed,
> > >> > > > > >
> > >> > > > > > Thanks for your answer.
> > >> > > > > > The ZK cluster is not located on the servers where Kafka runs,
> > >> > > > > > but on 3 other machines. This ZK cluster is used by several
> > >> > > > > > other services which are not reporting problems.
> > >> > > > > > Regarding your question, I haven't tried restarting the
> > >> > > > > > kafka-server processes because there's no leader for the topic
> > >> > > > > > partitions, so I don't know what will happen. I have never
> > >> > > > > > been in a similar situation in some years of using Kafka.
> > >> > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > On 1 June 2017 at 16:29, Mohammed Manna <manmedia@gmail.com> wrote:
> > >> > > > > >
> > >> > > > > > > Hi Alberto,
> > >> > > > > > >
> > >> > > > > > > Usually this means that leader election or replica syncing
> > >> > > > > > > did not succeed, and the ZooKeeper logs should show this
> > >> > > > > > > information too. The Leader: -1 is what worries me. For your
> > >> > > > > > > case (3-broker cluster), I am assuming you have configured
> > >> > > > > > > the cluster with 1 ZooKeeper per broker?
> > >> > > > > > > If that's the case, you should be able to bounce 1 ZooKeeper
> > >> > > > > > > at a time and see if that resolves the issue.
> > >> > > > > > >
> > >> > > > > > > That said, have you restarted your servers since this issue
> > >> > > > > > > surfaced?
> > >> > > > > > >
> > >> > > > > > > On 1 June 2017 at 14:11, Del Barrio, Alberto <
> > >> > > > > > > alberto.delbarrio@360dialog.com> wrote:
> > >> > > > > > >
> > >> > > > > > > > Hi all,
> > >> > > > > > > >
> > >> > > > > > > > I'm experiencing an issue which I don't know how to solve,
> > >> > > > > > > > so I'm looking for some guidance.
> > >> > > > > > > >
> > >> > > > > > > > I have a cluster composed of 3 servers, one broker per
> > >> > > > > > > > server, running Kafka 0.10.0.1-1. It runs in production
> > >> > > > > > > > with around 100 topics, most of them divided into several
> > >> > > > > > > > partitions and always replicated between 2 servers.
> > >> > > > > > > > Suddenly I noticed, when looking at my topics with the
> > >> > > > > > > > kafka-topics tool, that none of them has a leader
> > >> > > > > > > > (Leader: -1) and the ISR list appears empty for all the
> > >> > > > > > > > topics.
> > >> > > > > > > > They look something like:
> > >> > > > > > > >
> > >> > > > > > > > Topic:mytopic   PartitionCount:3   ReplicationFactor:2   Configs:retention.ms=86400000
> > >> > > > > > > >     Topic: mytopic    Partition: 0    Leader: -1    Replicas: 30,10    Isr:
> > >> > > > > > > >     Topic: mytopic    Partition: 1    Leader: -1    Replicas: 10,20    Isr:
> > >> > > > > > > >     Topic: mytopic    Partition: 2    Leader: -1    Replicas: 20,30    Isr:
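A check for this state can be scripted against the `--describe` output. The sketch below assumes the line layout shown above, and treats both `-1` and `none` (the two values seen in this thread) as "no leader"; the function name and regular expression are my own and not part of any Kafka tooling.

```python
import re

# Matches the per-partition lines of `kafka-topics --describe` as shown above.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>\S*)\s+"
    r"Isr:\s*(?P<isr>\S*)"
)


def unhealthy_partitions(describe_output):
    """Return (topic, partition) pairs with no leader or an empty ISR."""
    bad = []
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match is None:
            continue  # topic summary line or anything else
        no_leader = match.group("leader") in ("-1", "none")
        empty_isr = match.group("isr") == ""
        if no_leader or empty_isr:
            bad.append((match.group("topic"), int(match.group("partition"))))
    return bad


sample = """\
Topic:mytopic   PartitionCount:3   ReplicationFactor:2   Configs:retention.ms=86400000
    Topic: mytopic    Partition: 0    Leader: -1    Replicas: 30,10    Isr:
    Topic: mytopic    Partition: 1    Leader: -1    Replicas: 10,20    Isr:
    Topic: mytopic    Partition: 2    Leader: -1    Replicas: 20,30    Isr:
"""
print(unhealthy_partitions(sample))
# [('mytopic', 0), ('mytopic', 1), ('mytopic', 2)]
```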
> > >> > > > > > > >
> > >> > > > > > > > However, the applications using it are running normally,
> > >> > > > > > > > consumers as well as producers.
> > >> > > > > > > > The logs are not showing errors or weird messages, with
> > >> > > > > > > > the exception of some
> > >> > > > > > > > Failed to rename [/var/log/kafka/log-cleaner.log] to
> > >> > > > > > > > [/var/log/kafka/log-cleaner.log.2017-05-31-17]
> > >> > > > > > > > which appear every few days.
> > >> > > > > > > >
> > >> > > > > > > > Now I would like to bring the cluster back to a good state.
> > >> > > > > > > > I'm afraid of restarting brokers because all of them are
> > >> > > > > > > > supposed to be leaders for some partitions, so if I
> > >> > > > > > > > restart them and there's no leader I might experience
> > >> > > > > > > > data loss.
> > >> > > > > > > >
> > >> > > > > > > > Have you faced any similar situation? Can someone give me
> > >> > > > > > > > a hint?
> > >> > > > > > > >
> > >> > > > > > > > Thanks in advance,
> > >> > > > > > > > Alberto.
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> >
>



-- 
News, jobs, product releases, events.
Follow 360dialog on LinkedIn <https://www.linkedin.com/company/360dialog>
 and Twitter <https://twitter.com/360dialog>.
Subscribe to our newsletter <http://www.360dialog.com/newsletter/>.


*Alberto del Barrio*DevOps Engineer

<http://www.360dialog.com?utm_campaign=email-signature&utm_content=-&utm_medium=email&utm_source=signature&utm_term=logo>



*Contact 360dialog*www.360dialog.com
<http://www.360dialog.com/?utm_campaign=email-signature&utm_content=-&utm_medium=email&utm_source=signature&utm_term=url>
info@360dialog.com
+49-(0)30-6098-5953-0

360dialog GmbH, Saarbrücker Str. 36-38, 10405 Berlin, Germany
Managing director: Roland Siebert
Commercial register: Charlottenburg, HRB 144188 B
VAT ID: DE815382679
