kafka-users mailing list archives

From Jason Huang <jason.hu...@icare.com>
Subject Re: Replicas for partition are dead
Date Fri, 22 Mar 2013 15:09:51 GMT
I see.

Since I am not running anything for production at this point - I will
probably just do this.

However, if someone runs 0.8.0 in production and wants to upgrade
to the latest version, how should they migrate their messages? Maybe
there should be something documented in the wiki for this?

thanks,

Jason

On Fri, Mar 22, 2013 at 11:07 AM, Jun Rao <junrao@gmail.com> wrote:
> The easiest way is to wipe out both ZK and kafka data and start from
> scratch.
>
> Thanks,
>
> Jun
>
> On Fri, Mar 22, 2013 at 6:51 AM, Jason Huang <jason.huang@icare.com> wrote:
>
>> Thanks Jun.
>>
>> I have built the new kafka version and started the services. You
>> mentioned that the ZK data structure has been changed - does that mean
>> we can't reload the previous messages from the current log files? I
>> actually tried to copy the log files (.log and .index) to the new kafka
>> instance but got the same "topic doesn't exist" error after starting
>> the new kafka services.
>>
>> Any comments on how I might be able to recover previous messages?
>>
>> thanks
>>
>> Jason
>>
>> On Wed, Mar 20, 2013 at 12:15 PM, Jun Rao <junrao@gmail.com> wrote:
>> > The latest version of 0.8 can be found in the 0.8 branch, not trunk.
>> >
>> > Thanks,
>> >
>> > Jun
>> >
>> > On Wed, Mar 20, 2013 at 7:47 AM, Jason Huang <jason.huang@icare.com> wrote:
>> >
>> >> The 0.8 version I use was built from trunk last Dec. Since then, this
>> >> error happened 3 times. Each time we had to remove all the ZK and
>> >> Kafka log data and restart the services.
>> >>
>> >> I will try newer versions with more recent patches and keep
>> >> monitoring it.
>> >>
>> >> thanks!
>> >>
>> >> Jason
>> >>
>> >> On Wed, Mar 20, 2013 at 10:39 AM, Jun Rao <junrao@gmail.com> wrote:
>> >> > Ok, so you are using the same broker id. What the error is saying is
>> >> > that broker 1 doesn't seem to be up.
>> >> >
>> >> > Not sure what revision of 0.8 you are using. Could you try the latest
>> >> > revision in 0.8 and see if the problem still exists? You may have to
>> >> > wipe out all ZK and Kafka data first since some ZK data structures
>> >> > were renamed a few weeks ago.
>> >> >
>> >> > Thanks,
>> >> >
>> >> > Jun
>> >> >
>> >> > On Wed, Mar 20, 2013 at 6:57 AM, Jason Huang <jason.huang@icare.com> wrote:
>> >> >
>> >> >> I restarted the zookeeper server first, then broker. It's the same
>> >> >> instance of kafka 0.8 and I am using the same config file. In
>> >> >> server.properties I have: brokerid=1
>> >> >>
>> >> >> Is that sufficient to ensure the broker gets restarted with the
>> >> >> same broker id as before?
>> >> >>
>> >> >> thanks,
>> >> >>
>> >> >> Jason
>> >> >>
>> >> >> On Wed, Mar 20, 2013 at 12:30 AM, Jun Rao <junrao@gmail.com> wrote:
>> >> >> > Did the broker get restarted with the same broker id?
>> >> >> >
>> >> >> > Thanks,
>> >> >> >
>> >> >> > Jun
>> >> >> >
>> >> >> > On Tue, Mar 19, 2013 at 1:34 PM, Jason Huang <jason.huang@icare.com> wrote:
>> >> >> >
>> >> >> >> Hello,
>> >> >> >>
>> >> >> >> My kafka (0.8) server went down today for an unknown reason, and
>> >> >> >> when I restarted both zookeeper and the kafka server I got the
>> >> >> >> following error in the kafka server log:
>> >> >> >>
>> >> >> >> [2013-03-19 13:39:16,131] INFO [Partition state machine on Controller 1]: Invoking state change to OnlinePartition for partitions (kafka.controller.PartitionStateMachine)
>> >> >> >> [2013-03-19 13:39:16,262] INFO [Partition state machine on Controller 1]: Electing leader for partition [topic_a937ac27-1883-4ca0-95bc-c9a740d08947, 0] (kafka.controller.PartitionStateMachine)
>> >> >> >> [2013-03-19 13:39:16,451] ERROR [Partition state machine on Controller 1]: State change for partition [topic_a937ac27-1883-4ca0-95bc-c9a740d08947, 0] from OfflinePartition to OnlinePartition failed (kafka.controller.PartitionStateMachine)
>> >> >> >> kafka.common.PartitionOfflineException: All replicas for partition [topic_a937ac27-1883-4ca0-95bc-c9a740d08947, 0] are dead. Marking this partition offline
>> >> >> >>         at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:300)
>> >> >> >> .....
>> >> >> >> Caused by: kafka.common.PartitionOfflineException: No replica for partition ([topic_a937ac27-1883-4ca0-95bc-c9a740d08947, 0]) is alive. Live brokers are: [Set()], Assigned replicas are: [List(1)]
>> >> >> >> .......
>> >> >> >>
>> >> >> >> I am using a single server to host kafka and zookeeper.
>> >> >> >> Replication factor is set to 1.
>> >> >> >>
>> >> >> >> This happened for all the existing topics. Not sure how this
>> >> >> >> happened, but it appears to be a bug. I did some searching, and
>> >> >> >> the only possible fix for this bug seems to be KAFKA-708.
>> >> >> >>
>> >> >> >> Any comments on this?  Thanks!
>> >> >> >>
>> >> >> >> Jason
>> >> >> >>
>> >> >>
>> >>
>>
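
The "Caused by" line in the original report carries the whole diagnosis:
the set of live brokers is empty while the partition is assigned to
broker 1. As a rough sketch only, the snippet below pulls those two lists
out of such a line and reports which assigned replicas are not live. The
names PATTERN, parse_ids and diagnose are hypothetical helpers written for
this illustration, not part of Kafka, and the pattern assumes the exact
message wording quoted in this thread.

```python
import re

# Assumption: the message uses the wording quoted in this thread, e.g.
# "Live brokers are: [Set()], Assigned replicas are: [List(1)]".
PATTERN = re.compile(
    r"Live brokers are: \[Set\((?P<live>[^)]*)\)\], "
    r"Assigned replicas are: \[List\((?P<assigned>[^)]*)\)\]"
)


def parse_ids(text):
    """Turn a comma-separated id list such as '1, 2' into [1, 2]."""
    return [int(part) for part in text.split(",") if part.strip()]


def diagnose(line):
    """Return live brokers, assigned replicas and the assigned-but-dead ones.

    Returns None when the line does not contain the expected message.
    """
    # Collapse whitespace so a message wrapped by a mail client still matches.
    normalized = " ".join(line.split())
    match = PATTERN.search(normalized)
    if match is None:
        return None
    live = set(parse_ids(match.group("live")))
    assigned = parse_ids(match.group("assigned"))
    dead = [broker for broker in assigned if broker not in live]
    return {"live": live, "assigned": assigned, "dead": dead}


if __name__ == "__main__":
    sample = (
        "Caused by: kafka.common.PartitionOfflineException: No replica for "
        "partition ([topic_a937ac27-1883-4ca0-95bc-c9a740d08947, 0]) is alive. "
        "Live brokers are: [Set()], Assigned replicas are: [List(1)]"
    )
    print(diagnose(sample))
```

For the line from this thread the "dead" list contains broker 1, which
matches Jun's reading that broker 1 did not appear to be up from the
controller's point of view.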
