storm-user mailing list archives

From S G <sg.online.em...@gmail.com>
Subject Re: UnknownHostException when a Zookeeper instance goes down on AWS
Date Thu, 23 Feb 2017 16:48:37 GMT
There is a JIRA for upgrading ZooKeeper in Storm:
https://issues.apache.org/jira/browse/STORM-2290

You can add your notes there.

-SG


On Thu, Feb 23, 2017 at 8:32 AM, S G <sg.online.email@gmail.com> wrote:

> Thanks Anthony,
>
> This is a serious issue with zookeeper.
>
> Did you try asking on the ZooKeeper forums why 3.5.x is still in alpha?
> It might be worth knowing what is keeping them from releasing a non-alpha
> version.
>
> Thanks
> SG
>
> On Wed, Feb 22, 2017 at 6:03 AM, Anthony Milbourne <
> anthony.milbourne@mporium.com> wrote:
>
>> Hi,
>>
>>
>>
>> We run a storm cluster (v.1.0.2) on AWS and have 3 Zookeepers supporting
>> it.  Because AWS sometimes terminates VMs, we sometimes lose a Zookeeper
>> instance.  When this happens, the hostname cannot be resolved for that
>> zookeeper instance as AWS has taken the VM away.  We noticed that in this
>> case storm fails to connect to zookeeper – even though there are still 2
>> Zookeeper instances running.  It fails with an exception something like:
>>
>>
>>
>> java.net.UnknownHostException: zookeeper3
>>   at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
>>   at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>>   at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>>   at org.apache.storm.shade.org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
>>   at org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
>>   at org.apache.storm.shade.org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeeperFactory.java:29)
>>   at org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(CuratorFrameworkImpl.java:150)
>>   at org.apache.storm.shade.org.apache.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:94)
>>   at org.apache.storm.shade.org.apache.curator.HandleHolder.getZooKeeper(HandleHolder.java:55)
>>   at org.apache.storm.shade.org.apache.curator.ConnectionState.reset(ConnectionState.java:218)
>>   at org.apache.storm.shade.org.apache.curator.ConnectionState.start(ConnectionState.java:103)
>>   at org.apache.storm.shade.org.apache.curator.CuratorZookeeperClient.start(CuratorZookeeperClient.java:190)
>>   at org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl.start(CuratorFrameworkImpl.java:259)
>>   at org.apache.storm.zookeeper$mk_client.doInvoke(zookeeper.clj:86)
>>   at clojure.lang.RestFn.invoke(RestFn.java:494)
>>   at org.apache.storm.cluster_state.zookeeper_state_factory$_mkState.invoke(zookeeper_state_factory.clj:28)
>>   at org.apache.storm.cluster_state.zookeeper_state_factory.mkState(Unknown Source)
>>
>>   <SNIP REST OF STACKTRACE>
>>
>>
>>
>> Having done some research, it looks like this error is caused by a bug in
>> the Zookeeper client library.  There is an issue for it here:
>>
>> https://issues.apache.org/jira/browse/ZOOKEEPER-1576
>>
>> This issue has been resolved in the 3.5.x branch of Zookeeper.
>> However, after 2.5 years and 3 releases, the 3.5.x branch of Zookeeper is
>> still in alpha :(
>>
>>
>>
>> Despite the fact that it is in alpha, there is a branch of Curator
>> (v.3.x.x) that uses it, but Storm uses Curator version 2.x.x – possibly
>> because it doesn’t rely on alpha code.
>>
>> So the bug is still unpatched in Storm.
>>
>>
>>
>> Does anyone have experience of this issue?
>>
>> Can anyone offer any ideas for workarounds?
>>
>>
>>
>> Thanks,
>>
>>
>>
>>      Anthony.
>> Anthony Milbourne
>> *anthony.milbourne@mporium.com* <anthony.milbourne@mporium.com>
>> *mporium.com* <https://mporium.com/>
>> mporium Group Plc, registered in England and Wales - First Floor,
>> 106 New Bond Street, London, W1S 1DN
>> We're hiring -
>> *join the mporium team* <https://mporium.com/careers>
>>
>
>
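On the workaround question in the original message, one possibility is to drop unresolvable hostnames from the server list before the ZooKeeper client ever sees it. The Java sketch below is only an illustration of that idea, not the ZOOKEEPER-1576 patch and not anything Storm or Curator provides. The class and method names (ResolvableZkHosts, resolveStrict, resolvableHosts) are made up for this example, and where such a filter would be hooked into Storm's startup is left open. resolveStrict mimics the all-or-nothing lookup seen at the top of the stack trace; resolvableHosts skips hosts that fail to resolve.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, named for this example only.
class ResolvableZkHosts {

    // Mimics the behaviour in the stack trace: every host is resolved up
    // front, so a single unknown host aborts the whole attempt.
    static List<InetAddress> resolveStrict(List<String> hosts) throws UnknownHostException {
        List<InetAddress> resolved = new ArrayList<>();
        for (String host : hosts) {
            resolved.addAll(Arrays.asList(InetAddress.getAllByName(host)));
        }
        return resolved;
    }

    // Tolerant variant: keep the hosts that resolve right now, skip the rest.
    static List<String> resolvableHosts(List<String> hosts) {
        List<String> usable = new ArrayList<>();
        for (String host : hosts) {
            try {
                InetAddress.getAllByName(host);
                usable.add(host);
            } catch (UnknownHostException e) {
                System.err.println("Skipping unresolvable ZooKeeper host: " + host);
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        // "zookeeper3.invalid" stands in for the terminated instance; the
        // .invalid TLD is reserved and should not resolve on a normal resolver.
        List<String> configured = Arrays.asList("127.0.0.1", "zookeeper3.invalid");
        System.out.println("Usable hosts: " + resolvableHosts(configured));
        try {
            resolveStrict(configured);
        } catch (UnknownHostException e) {
            System.out.println("Strict resolution failed: " + e.getMessage());
        }
    }
}
```

A filter like this only helps at the moment the client is created. It does not bring the lost instance back into the list once AWS replaces it, so the list would have to be rebuilt whenever the client is restarted.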
