kafka-users mailing list archives

From Kane Kane <kane.ist...@gmail.com>
Subject Re: Consumer offset is getting reset back to some old value automatically
Date Wed, 25 Jun 2014 17:07:27 GMT
Michal, as I understand it, it's "risky" in the sense that if another
(2nd) node were to fail, your ZooKeeper ensemble would no longer be
operational, right? But as long as there are no other failures, it
shouldn't affect operations.
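
To make the quorum arithmetic concrete, here is a rough sketch (my own
illustration, not code from Kafka or ZooKeeper) of majority sizing for a
few ensemble sizes:

def quorum_size(ensemble_size):
    # A majority quorum is strictly more than half of the ensemble.
    return ensemble_size // 2 + 1

def tolerated_failures(ensemble_size):
    # Nodes you can lose while a majority can still be formed.
    return ensemble_size - quorum_size(ensemble_size)

for n in (3, 4, 5):
    print("ensemble=%d quorum=%d tolerated_failures=%d"
          % (n, quorum_size(n), tolerated_failures(n)))
# ensemble=3 quorum=2 tolerated_failures=1
# ensemble=4 quorum=3 tolerated_failures=1
# ensemble=5 quorum=3 tolerated_failures=2

So with 1 of 3 nodes down for maintenance, the remaining 2 are exactly a
quorum: if either of them hiccups during a subsequent leader election,
there is no majority, which is the risk being discussed.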

On Wed, Jun 25, 2014 at 2:08 AM, Michal Michalski
<michal.michalski@boxever.com> wrote:
> My understanding is that "bringing down 1 node out of a 3 node zookeeper
> cluster is risky, since any subsequent leader election *might* not reach a
> quorum" and "It is less likely but still risky to some extent" mean exactly
> that - it *might* not reach a quorum - because you need both of the
> remaining nodes to be up to reach quorum (it will of course still be
> possible, but it *might* fail). In the case of a 5-node cluster, having 1
> node down is not that risky, because you still have 4 nodes and you only
> need 3 of them to reach quorum.
>
> M.
>
> Kind regards,
> Michał Michalski,
> michal.michalski@boxever.com
>
>
> On 25 June 2014 09:59, Kane Kane <kane.isturm@gmail.com> wrote:
>
>> Neha, thanks for the answer. I want to understand what the case is when:
>> >>Also, bringing down 1 node out of a 3 node zookeeper cluster is risky,
>> since any subsequent leader election might not reach a quorum
>>
>> I was thinking ZooKeeper guarantees quorum if only 1 node out of 3 fails?
>>
>> Thanks.
>>
>> On Tue, Jun 24, 2014 at 3:30 PM, Neha Narkhede <neha.narkhede@gmail.com>
>> wrote:
>> > See the explanation from the zookeeper folks here
>> > <https://zookeeper.apache.org/doc/r3.3.2/zookeeperAdmin.html> -
>> >
>> > " Because Zookeeper requires a majority, it is best to use an odd number
>> of
>> > machines. For example, with four machines ZooKeeper can only handle the
>> > failure of a single machine; if two machines fail, the remaining two
>> > machines do not constitute a majority. However, with five machines
>> > ZooKeeper can handle the failure of two machines."
>> >
>> > Hope that helps.
>> >
>> >
>> >
>> > On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane <kane.isturm@gmail.com> wrote:
>> >
>> >> Sorry, I meant 5 nodes in the previous question.
>> >>
>> >> > On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane <kane.isturm@gmail.com> wrote:
>> >> > Hello Neha,
>> >> >
>> >> >>>ZK cluster of 3 nodes will tolerate the loss of 1 node, but if there is a
>> >> > subsequent leader election for any reason, there is a chance that the
>> >> > cluster does not reach a quorum. It is less likely but still risky to some
>> >> > extent.
>> >> >
>> >> > Does it mean that if you have to tolerate the loss of 1 node without any
>> >> > issues, you need *at least* 4 nodes?
>> >> >
>> >> > On Tue, Jun 24, 2014 at 11:16 AM, Neha Narkhede <neha.narkhede@gmail.com> wrote:
>> >> >> Can you elaborate your notion of "smooth"? I thought if you have
>> >> >> replication factor=3 in this case, you should be able to tolerate loss
>> >> >> of a node?
>> >> >>
>> >> >> Yes, you should be able to tolerate the loss of a node but if controlled
>> >> >> shutdown is not enabled, the delay between loss of the old leader and
>> >> >> election of the new leader will be longer.
>> >> >>
>> >> >> So, you mean a ZK cluster of 3 nodes can't tolerate 1 node loss? I've
>> >> >> seen many recommendations to run a 3-node cluster; does it mean in a
>> >> >> cluster of 3 you won't be able to operate after losing 1 node?
>> >> >>
>> >> >> ZK cluster of 3 nodes will tolerate the loss of 1 node, but if there is a
>> >> >> subsequent leader election for any reason, there is a chance that the
>> >> >> cluster does not reach a quorum. It is less likely but still risky to some
>> >> >> extent.
>> >> >>
>> >> >>
>> >> >> On Tue, Jun 24, 2014 at 2:44 AM, Hemath Kumar <hksrckmurthy@gmail.com> wrote:
>> >> >>
>> >> >>> Yes Kane, I have the replication factor configured as 3
>> >> >>>
>> >> >>>
>> >> >>> On Tue, Jun 24, 2014 at 2:42 AM, Kane Kane <kane.isturm@gmail.com> wrote:
>> >> >>>
>> >> >>> > Hello Neha, can you explain your statements:
>> >> >>> > >>Bringing one node down in a cluster will go smoothly only if your
>> >> >>> > replication factor is 1 and you enabled controlled shutdown on the
>> >> >>> > brokers.
>> >> >>> >
>> >> >>> > Can you elaborate your notion of "smooth"? I thought if you have
>> >> >>> > replication factor=3 in this case, you should be able to tolerate loss
>> >> >>> > of a node?
>> >> >>> >
>> >> >>> > >>Also, bringing down 1 node out of a 3 node zookeeper cluster is
>> >> >>> > risky, since any subsequent leader election might not reach a quorum.
>> >> >>> >
>> >> >>> > So, you mean a ZK cluster of 3 nodes can't tolerate 1 node loss? I've
>> >> >>> > seen many recommendations to run a 3-node cluster; does it mean in a
>> >> >>> > cluster of 3 you won't be able to operate after losing 1 node?
>> >> >>> >
>> >> >>> > Thanks.
>> >> >>> >
>> >> >>> > On Mon, Jun 23, 2014 at 9:04 AM, Neha Narkhede <neha.narkhede@gmail.com> wrote:
>> >> >>> > > Bringing one node down in a cluster will go smoothly only if your
>> >> >>> > > replication factor is 1 and you enabled controlled shutdown on the
>> >> >>> > > brokers. Also, bringing down 1 node out of a 3 node zookeeper cluster
>> >> >>> > > is risky, since any subsequent leader election might not reach a
>> >> >>> > > quorum. Having said that, a partition going offline shouldn't cause a
>> >> >>> > > consumer's offset to reset to an old value. How did you find out what
>> >> >>> > > the consumer's offset was? Do you have your consumer's logs around?
>> >> >>> > >
>> >> >>> > > Thanks,
>> >> >>> > > Neha
>> >> >>> > >
>> >> >>> > >
>> >> >>> > > On Mon, Jun 23, 2014 at 12:28 AM, Hemath Kumar <hksrckmurthy@gmail.com> wrote:
>> >> >>> > >
>> >> >>> > >> We have a 3 node cluster (3 Kafka + 3 ZK nodes). Recently we came
>> >> >>> > >> across a strange issue where we wanted to bring one of the nodes
>> >> >>> > >> down from the cluster (1 Kafka + 1 ZooKeeper) for maintenance. But
>> >> >>> > >> the moment we brought it down, on some of the topics (only some
>> >> >>> > >> partitions) the consumers' offset was reset to some old value.
>> >> >>> > >>
>> >> >>> > >> Any reason why this happened? To my knowledge, when one node is
>> >> >>> > >> brought down, things should work smoothly without any impact.
>> >> >>> > >>
>> >> >>> > >> Thanks,
>> >> >>> > >> Murthy Chelankuri
>> >> >>> > >>
>> >> >>> >
>> >> >>>
>> >>
>>
