OK, thanks again for the hints on ZK and how to launch submit_plan. Now I’ve got a 'java.net.SocketException:
Network is unreachable'.
Background: I have three Drillbits running, all connected to ZK:
[zk: 127.0.0.1:2181(CONNECTED) 4] ls /drill/drillbits1
[d2e9c990-1607-48f8-8d99-4a209b312a43, 17bf46c9-23f2-42cc-8d25-cc42b7a599f0, 146c8df4-a62c-41b8-af1f-0f7551867d84]
When I then submit a physical plan:
$ bin/submit_plan -f sample-data/physical_json_scan_test1.json -t physical -zk 127.0.0.1:2181
I get:
[[
Exception in thread "main" org.apache.drill.exec.rpc.RpcException: Failure connecting to server.
Failure of type CONNECTION.
at org.apache.drill.exec.client.DrillClient$FutureHandler.connectionFailed(DrillClient.java:246)
at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:155)
at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:141)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:621)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:548)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:407)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:217)
at io.netty.channel.DefaultChannelPipeline$HeadHandler.connect(DefaultChannelPipeline.java:1008)
at io.netty.channel.DefaultChannelHandlerContext.invokeConnect(DefaultChannelHandlerContext.java:491)
at io.netty.channel.DefaultChannelHandlerContext.connect(DefaultChannelHandlerContext.java:476)
at io.netty.channel.ChannelOutboundHandlerAdapter.connect(ChannelOutboundHandlerAdapter.java:47)
at io.netty.channel.DefaultChannelHandlerContext.invokeConnect(DefaultChannelHandlerContext.java:491)
at io.netty.channel.DefaultChannelHandlerContext.connect(DefaultChannelHandlerContext.java:476)
at io.netty.channel.DefaultChannelHandlerContext.connect(DefaultChannelHandlerContext.java:461)
at io.netty.channel.DefaultChannelPipeline.connect(DefaultChannelPipeline.java:847)
at io.netty.channel.AbstractChannel.connect(AbstractChannel.java:198)
at io.netty.bootstrap.Bootstrap$2.run(Bootstrap.java:165)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:354)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:366)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.util.concurrent.ExecutionException: java.net.SocketException: Network is unreachable
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:147)
... 19 more
Caused by: java.net.SocketException: Network is unreachable
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:364)
at sun.nio.ch.Net.connect(Net.java:356)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:623)
at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:195)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:172)
... 14 more
]]
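A minimal sketch of how one might check whether a given endpoint is reachable from the client machine, independent of Drill. The trace above does not show which host and port the client tried, so both are left as arguments here; `can_connect` and its timeout are hypothetical helpers, not part of Drill or submit_plan.

```python
import socket
from typing import Optional, Tuple


def can_connect(host: str, port: int, timeout: float = 3.0) -> Tuple[bool, Optional[str]]:
    """Try a plain TCP connect and report (reachable, error message)."""
    try:
        # create_connection resolves the host and tries each returned address.
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers refused connections, timeouts and unreachable networks.
        return False, str(exc)
```

Calling it with the ZK address used above, e.g. `can_connect("127.0.0.1", 2181)`, only confirms that ZK itself is reachable; the failing connection here is the one from the client to a Drillbit.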
Thoughts?
Cheers,
Michael
--
Michael Hausenblas
Ireland, Europe
http://mhausenblas.info/
On 27 Oct 2013, at 22:48, Steven Phillips <sphillips@maprtech.com> wrote:
> Actually, I am wrong, Drill does not start a zookeeper when running in
> local mode. The LocalClusterCoordinator does not use zookeeper at all.
>
>
> On Sun, Oct 27, 2013 at 3:44 PM, Steven Phillips <sphillips@maprtech.com> wrote:
>
>> Drill will start a zookeeper only in embedded mode. For example, running
>> sqlline using parquet-local will launch a drillbit and zk all within one
>> JVM.
>>
>> But to run a standalone drillbit requires an external zookeeper.
>>
>>
>> On Sun, Oct 27, 2013 at 3:39 PM, Michael Hausenblas <
>> michael.hausenblas@gmail.com> wrote:
>>
>>>
>>> Maybe I'm dense but I thought Drill starts a ZK? Or do I have to install
>>> and launch ZK separately?
>>>
>>> I'm using the binary version of M1. Run all things local only on my
>>> laptop ...
>>>
>>> Cheers,
>>> Michael
>>>
>>> Sent from my iPad
>>>
>>> --
>>> Michael Hausenblas, http://mhausenblas.info
>>>
>>>> On 27 Oct 2013, at 22:17, Steven Phillips <sphillips@maprtech.com> wrote:
>>>>
>>>> You need to replace localhost with the hostname of the node running
>>>> zookeeper. If that zookeeper is configured to use a port other than
>>>> 2181, then that needs to be set as well. If you have multiple zookeepers
>>>> in the quorum, then zk.connect should be a comma-separated list of the
>>>> host:port of each node.
>>>>
>>>> The default localhost setting will only work when a drillbit is running
>>>> on the same node as the zookeeper.
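To make the comma-separated host:port format concrete, here is a small sketch that splits such a string into individual endpoints. `parse_zk_connect` is a hypothetical helper, not something Drill ships, and the assumption that a missing port falls back to 2181 is mine; IPv6 literals are not handled.

```python
from typing import List, Tuple

DEFAULT_ZK_PORT = 2181  # assumed fallback when an entry has no explicit port


def parse_zk_connect(connect: str) -> List[Tuple[str, int]]:
    """Split a comma-separated host:port string into (host, port) pairs."""
    endpoints = []
    for entry in connect.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        host, sep, port = entry.rpartition(":")
        if not sep:
            # No colon in the entry: treat it as a bare hostname.
            host, port = entry, str(DEFAULT_ZK_PORT)
        endpoints.append((host, int(port)))
    return endpoints
```

For example, `parse_zk_connect("zk1:2181,zk2:2181,zk3:2181")` yields one pair per quorum member, where zk1 to zk3 are placeholder hostnames.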
>>>>
>>>>
>>>> On Sun, Oct 27, 2013 at 2:57 PM, Michael Hausenblas <
>>>> michael.hausenblas@gmail.com> wrote:
>>>>
>>>>>
>>>>>> One thing to add to the diagram is that all of the drill java processes
>>>>>> will look at what is in drill-override.conf.
>>>>>
>>>>> Thanks, done.
>>>>>
>>>>>
>>>>>> You must set zk.connect to the correct zk host:port.
>>>>>
>>>>>
>>>>> Can you be a tad more explicit, please? In drill-override.conf I have
>>>>>
>>>>> [[
>>>>> …
>>>>> zk: {
>>>>> connect: "localhost:2181",
>>>>> …
>>>>> ]]
>>>>>
>>>>>
>>>>> What am I overlooking?
>>>>>
>>>>> Also, any directions re the rest of my questions (re bin/submit_plan etc.)?
>>>>>
>>>>>
>>>>> With a little help from here, I’m happy to put together a description
>>>>> of how to set this up in the Wiki, also to address a query by Steve
>>>>> McPherson that has now been lying around for more than three weeks. See
>>>>> http://mail-archives.apache.org/mod_mbox/incubator-drill-user/201310.mbox/%3CCE71A20F.14F5B%25stevemp%40amazon.com%3E
>>>>> The fact that it attracted 0 responses I find slightly embarrassing, and
>>>>> if I were Steve, I’d prolly not touch Drill anymore, but let’s hope for
>>>>> the best …
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Michael
>>>>>
>>>>> --
>>>>> Michael Hausenblas
>>>>> Ireland, Europe
>>>>> http://mhausenblas.info/
>>>>>
>>>>>> On 27 Oct 2013, at 21:32, Steven Phillips <sphillips@maprtech.com> wrote:
>>>>>>
>>>>>> One thing to add to the diagram is that all of the drill java processes
>>>>>> will look at what is in drill-override.conf. You must set zk.connect to
>>>>>> the correct zk host:port.
>>>>>>
>>>>>>
>>>>>> On Sun, Oct 27, 2013 at 2:00 PM, Michael Hausenblas <
>>>>>> michael.hausenblas@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> Folks,
>>>>>>>
>>>>>>> I’m trying to set up Drill in distributed mode. Here’s what I have so far:
>>>>>>> when I launch the first Drillbit with bin/drillbit.sh I get the following
>>>>>>> in log/drillbit.out:
>>>>>>>
>>>>>>> [[
>>>>>>> 20:47:20.963 [main] ERROR com.netflix.curator.ConnectionState - Connection timed out for connection string (localhost:2181) and timeout (5000) / elapsed (5045)
>>>>>>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
>>>>>>> at com.netflix.curator.ConnectionState.getZooKeeper(ConnectionState.java:94) ~[curator-client-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:106) [curator-client-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:393) [curator-framework-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:184) [curator-framework-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:173) [curator-framework-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:85) [curator-client-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:169) [curator-framework-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:161) [curator-framework-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:36) [curator-framework-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.x.discovery.details.ServiceDiscoveryImpl.getChildrenWatched(ServiceDiscoveryImpl.java:306) [curator-x-discovery-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.x.discovery.details.ServiceDiscoveryImpl.queryForInstances(ServiceDiscoveryImpl.java:276) [curator-x-discovery-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.x.discovery.details.ServiceCache.refresh(ServiceCache.java:193) [curator-x-discovery-1.1.9.jar:na]
>>>>>>> at com.netflix.curator.x.discovery.details.ServiceCache.start(ServiceCache.java:116) [curator-x-discovery-1.1.9.jar:na]
>>>>>>> at org.apache.drill.exec.coord.ZKClusterCoordinator.start(ZKClusterCoordinator.java:89) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>>>>>> at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:94) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>>>>>> at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:56) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>>>>>> at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:43) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>>>>>> at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:65) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>>>>>> ]]
>>>>>>>
>>>>>>> This seems to be a known issue? See
>>>>>>> http://stackoverflow.com/questions/16056751/curator-zookeeper-client-keeps-throw-out-connectionlossexception-per-connection
>>>>>>>
>>>>>>> Any ideas? Did anyone actually run Drill in distributed mode already and
>>>>>>> if so, how did you overcome the above issue?
>>>>>>>
>>>>>>> What is next? How do I make other Drillbits point to the same ZK cluster?
>>>>>>> And does anyone have an example of the call parameters for bin/submit_plan
>>>>>>> as well?
>>>>>>>
>>>>>>>
>>>>>>> BTW, in the process of trying to figure out what’s going on behind the
>>>>>>> scenes I traced down the startup call dependencies (scripts), available via:
>>>>>>>
>>>>>>> https://docs.google.com/drawings/d/1-ADIGJ-lBr-dOrOjMpQlProiZjYjjuM0kR6A81BYwKA/edit?usp=sharing
>>>>>>>
>>>>>>> which we could then also use for documentation purposes.
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Michael
>>>>>>>
>>>>>>> --
>>>>>>> Michael Hausenblas
>>>>>>> Ireland, Europe
>>>>>>> http://mhausenblas.info/
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>
>>