storm-user mailing list archives

From Erik Weathers <eweath...@groupon.com>
Subject Re: How could one debug the root cause of this error?
Date Sat, 14 Jan 2017 20:19:27 GMT
On Fri, Jan 13, 2017 at 8:56 PM, Joaquin Menchaca <jmenchaca@gobalto.com>
wrote:

> I bounced everything across the cluster and that fixed the problem.
> ZooKeeper occasionally has data in a broken state.  There is no data
> integrity check yet.
>
> I also found I ran out of space on ZooKeeper, as it is chatty and was
> keeping gigabytes of archives. I turned that off.
>
> One time when I upgraded from 0.9 to 1.0, the zk data was so messed up
> that there were lots of crashes.
>

For completeness, that's an unfortunate but expected behavior, because
Storm stores lots of serialized objects in ZooKeeper, and the 0.9 to 1.0
upgrade included backwards-incompatible changes that broke deserialization
of that stored state.  The most pervasive of those changes was the package
path change from "backtype.*" to "org.apache.*", but there might have been
others.  I agree that it would be nice if there were some validation to
decide whether state should be rejected.
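
To illustrate why a package rename alone can break stored state, here is a
minimal sketch in Python. It is only an analogy and is not Storm's code:
it uses Python's pickle rather than whatever serializer Storm uses, and
the module names "backtype_demo" and "org_apache_demo", the Assignment
class, and the can_deserialize helper are all made up for this example.
The point is that the serialized bytes embed the class's qualified name,
so a reader that only knows the new path cannot load the old bytes.

```python
import pickle
import sys
import types


def make_module(name):
    """Create an in-memory module holding a tiny Assignment class (hypothetical)."""
    module = types.ModuleType(name)

    class Assignment:
        def __init__(self, topology_id):
            self.topology_id = topology_id

    # Make the class look as if it had been defined at the top level of `name`.
    Assignment.__module__ = name
    Assignment.__qualname__ = "Assignment"
    module.Assignment = Assignment
    sys.modules[name] = module
    return module


def can_deserialize(blob):
    """A crude validation step: report whether stored bytes still load."""
    try:
        pickle.loads(blob)
        return True
    except (ImportError, AttributeError):
        return False


# "Old release": the class lives under the old package path.
old = make_module("backtype_demo")
blob = pickle.dumps(old.Assignment("topo-1"))  # stands in for state kept in a store

# The qualified class name travels inside the stored bytes.
name_is_embedded = b"backtype_demo" in blob
loadable_before_upgrade = can_deserialize(blob)

# "Upgrade": the old path disappears; the class is republished under a new one.
del sys.modules["backtype_demo"]
make_module("org_apache_demo")

loadable_after_upgrade = can_deserialize(blob)
print("embedded:", name_is_embedded)
print("loadable before upgrade:", loadable_before_upgrade)
print("loadable after upgrade:", loadable_after_upgrade)
```

A check along the lines of can_deserialize is one shape the validation
could take: test stale state on startup and reject it, instead of
crashing later when it is first read.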

- Erik


> I manually blasted (rm -rf) all the zk data, and that fixed things up.
>
> On Dec 22, 2016 4:37 PM, "Hugo Da Cruz Louro" <hlouro@hortonworks.com>
> wrote:
>
>> Is it feasible for you to restart your ZooKeeper cluster? If so, can
>> you do that, and then restart Storm and deploy your Storm topology again?
>>
>> On Dec 22, 2016, at 3:22 PM, Joaquin Menchaca <jmenchaca@gobalto.com>
>> wrote:
>>
>> Found nimbuses [] none of which is elected as leader, please try again after some time
>>
>>
>>
