kafka-users mailing list archives

From Michal Michalski <michal.michal...@boxever.com>
Subject Re: Consumer offset is getting reset back to some old value automatically
Date Wed, 25 Jun 2014 09:08:34 GMT
My understanding is this: "bringing down 1 node out of a 3 node zookeeper
cluster is risky, since any subsequent leader election *might* not reach a
quorum" and "It is less likely but still risky to some extent" mean that
*"it might not reach a quorum"* because you need both of the remaining
nodes to be up to form a majority (of course it is still possible, but it
*might* fail). In the case of a 5-node cluster, having 1 node down is not
that risky, because you still have 4 nodes and you need only 3 of them to
reach quorum.

M.

Kind regards,
Michał Michalski,
michal.michalski@boxever.com


On 25 June 2014 09:59, Kane Kane <kane.isturm@gmail.com> wrote:

> Neha, thanks for the answer. I want to understand the case described here:
> >>Also, bringing down 1 node out of a 3 node zookeeper cluster is risky,
> since any subsequent leader election might not reach a quorum
>
> I was thinking zookeeper guarantees quorum if only 1 node out of 3 fails?
>
> Thanks.
>
> On Tue, Jun 24, 2014 at 3:30 PM, Neha Narkhede <neha.narkhede@gmail.com>
> wrote:
> > See the explanation from the zookeeper folks here
> > <https://zookeeper.apache.org/doc/r3.3.2/zookeeperAdmin.html> -
> >
> > " Because Zookeeper requires a majority, it is best to use an odd number
> of
> > machines. For example, with four machines ZooKeeper can only handle the
> > failure of a single machine; if two machines fail, the remaining two
> > machines do not constitute a majority. However, with five machines
> > ZooKeeper can handle the failure of two machines."
> >
> > Hope that helps.
> >
> >
> >
> > On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane <kane.isturm@gmail.com> wrote:
> >
> >> Sorry, I meant 5 nodes in the previous question.
> >>
> >> On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane <kane.isturm@gmail.com> wrote:
> >> > Hello Neha,
> >> >
> >> >>>ZK cluster of 3 nodes will tolerate the loss of 1 node, but if there is a
> >> > subsequent leader election for any reason, there is a chance that the
> >> > cluster does not reach a quorum. It is less likely but still risky to some
> >> > extent.
> >> >
> >> > Does it mean if you have to tolerate 1 node loss without any issues,
> >> > you need *at least* 4 nodes?
> >> >
> >> > On Tue, Jun 24, 2014 at 11:16 AM, Neha Narkhede <neha.narkhede@gmail.com> wrote:
> >> >> Can you elaborate your notion of "smooth"? I thought if you have
> >> >> replication factor=3 in this case, you should be able to tolerate loss
> >> >> of a node?
> >> >>
> >> >> Yes, you should be able to tolerate the loss of a node but if controlled
> >> >> shutdown is not enabled, the delay between loss of the old leader and
> >> >> election of the new leader will be longer.
> >> >>
> >> >> So, you mean ZK cluster of 3 nodes can't tolerate 1 node loss? I've
> >> >> seen many recommendations to run a 3-node cluster, does it mean in a
> >> >> cluster of 3 you won't be able to operate after losing 1 node?
> >> >>
> >> >> ZK cluster of 3 nodes will tolerate the loss of 1 node, but if there is a
> >> >> subsequent leader election for any reason, there is a chance that the
> >> >> cluster does not reach a quorum. It is less likely but still risky to some
> >> >> extent.
> >> >>
> >> >>
> >> >> On Tue, Jun 24, 2014 at 2:44 AM, Hemath Kumar <hksrckmurthy@gmail.com>
> >> >> wrote:
> >> >>
> >> >>> Yes Kane, I have the replication factor configured as 3.
> >> >>>
> >> >>>
> >> >>> On Tue, Jun 24, 2014 at 2:42 AM, Kane Kane <kane.isturm@gmail.com> wrote:
> >> >>>
> >> >>> > Hello Neha, can you explain your statements:
> >> >>> > >>Bringing one node down in a cluster will go smoothly only if your
> >> >>> > replication factor is 1 and you enabled controlled shutdown on the
> >> >>> > brokers.
> >> >>> >
> >> >>> > Can you elaborate your notion of "smooth"? I thought if you have
> >> >>> > replication factor=3 in this case, you should be able to tolerate loss
> >> >>> > of a node?
> >> >>> >
> >> >>> > >>Also, bringing down 1 node out of a 3 node zookeeper cluster is risky,
> >> >>> > since any subsequent leader election might not reach a quorum.
> >> >>> >
> >> >>> > So, you mean ZK cluster of 3 nodes can't tolerate 1 node loss? I've
> >> >>> > seen many recommendations to run a 3-node cluster, does it mean in a
> >> >>> > cluster of 3 you won't be able to operate after losing 1 node?
> >> >>> >
> >> >>> > Thanks.
> >> >>> >
> >> >>> > On Mon, Jun 23, 2014 at 9:04 AM, Neha Narkhede <neha.narkhede@gmail.com>
> >> >>> > wrote:
> >> >>> > > Bringing one node down in a cluster will go smoothly only if your
> >> >>> > > replication factor is 1 and you enabled controlled shutdown on the
> >> >>> > > brokers.
> >> >>> > > Also, bringing down 1 node out of a 3 node zookeeper cluster is risky,
> >> >>> > > since any subsequent leader election might not reach a quorum. Having
> >> >>> > > said that, a partition going offline shouldn't cause a consumer's
> >> >>> > > offset to reset to an old value. How did you find out what the
> >> >>> > > consumer's offset was? Do you have your consumer's logs around?
> >> >>> > >
> >> >>> > > Thanks,
> >> >>> > > Neha
> >> >>> > >
> >> >>> > >
> >> >>> > > On Mon, Jun 23, 2014 at 12:28 AM, Hemath Kumar <hksrckmurthy@gmail.com>
> >> >>> > > wrote:
> >> >>> > >
> >> >>> > >> We have a 3 node cluster ( 3 kafka + 3 ZK nodes ). Recently we came
> >> >>> > >> across a strange issue when we wanted to bring one of the nodes down
> >> >>> > >> from the cluster ( 1 kafka + 1 zookeeper) for maintenance. But the
> >> >>> > >> moment we brought it down, on some of the topics ( only some
> >> >>> > >> partitions) the consumers' offset was reset to some old value.
> >> >>> > >>
> >> >>> > >> Any reason why this happened? To my knowledge, when one node is
> >> >>> > >> brought down, it should work smoothly without any impact.
> >> >>> > >>
> >> >>> > >> Thanks,
> >> >>> > >> Murthy Chelankuri
> >> >>> > >>
> >> >>> >
> >> >>>
> >>
>
