OK, thanks again re the hints for ZK and how to launch submit_plan. Now I’ve got a ‘java.net.SocketException: Network is unreachable’.

Background: I’ve three Drillbits running all connected to ZK:

[zk: 127.0.0.1:2181(CONNECTED) 4] ls /drill/drillbits1
[d2e9c990-1607-48f8-8d99-4a209b312a43, 17bf46c9-23f2-42cc-8d25-cc42b7a599f0, 146c8df4-a62c-41b8-af1f-0f7551867d84]

When I then submit a physical plan:

$ bin/submit_plan -f sample-data/physical_json_scan_test1.json -t physical -zk 127.0.0.1:2181

I get:

[[
Exception in thread "main" org.apache.drill.exec.rpc.RpcException: Failure connecting to server. Failure of type CONNECTION.
    at org.apache.drill.exec.client.DrillClient$FutureHandler.connectionFailed(DrillClient.java:246)
    at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:155)
    at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:141)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:621)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:548)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:407)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:217)
    at io.netty.channel.DefaultChannelPipeline$HeadHandler.connect(DefaultChannelPipeline.java:1008)
    at io.netty.channel.DefaultChannelHandlerContext.invokeConnect(DefaultChannelHandlerContext.java:491)
    at io.netty.channel.DefaultChannelHandlerContext.connect(DefaultChannelHandlerContext.java:476)
    at io.netty.channel.ChannelOutboundHandlerAdapter.connect(ChannelOutboundHandlerAdapter.java:47)
    at io.netty.channel.DefaultChannelHandlerContext.invokeConnect(DefaultChannelHandlerContext.java:491)
    at io.netty.channel.DefaultChannelHandlerContext.connect(DefaultChannelHandlerContext.java:476)
    at io.netty.channel.DefaultChannelHandlerContext.connect(DefaultChannelHandlerContext.java:461)
    at io.netty.channel.DefaultChannelPipeline.connect(DefaultChannelPipeline.java:847)
    at io.netty.channel.AbstractChannel.connect(AbstractChannel.java:198)
    at io.netty.bootstrap.Bootstrap$2.run(Bootstrap.java:165)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:354)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:366)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.util.concurrent.ExecutionException: java.net.SocketException: Network is unreachable
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
    at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:147)
    ... 19 more
Caused by: java.net.SocketException: Network is unreachable
    at sun.nio.ch.Net.connect0(Native Method)
    at sun.nio.ch.Net.connect(Net.java:364)
    at sun.nio.ch.Net.connect(Net.java:356)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:623)
    at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:195)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:172)
    ... 14 more
]]

Thoughts?

Cheers,
Michael

--
Michael Hausenblas
Ireland, Europe
http://mhausenblas.info/

On 27 Oct 2013, at 22:48, Steven Phillips wrote:

> Actually, I am wrong, Drill does not start a zookeeper when running in
> local mode. The LocalClusterCoordinator does not use zookeeper at all.
>
>
> On Sun, Oct 27, 2013 at 3:44 PM, Steven Phillips wrote:
>
>> Drill will start a zookeeper only in embedded mode. For example, running
>> sqlline using parquet-local will launch a drillbit and zk all within one
>> JVM.
>>
>> But to run a standalone drillbit requires an external zookeeper.
>>
>>
>> On Sun, Oct 27, 2013 at 3:39 PM, Michael Hausenblas <
>> michael.hausenblas@gmail.com> wrote:
>>
>>>
>>> Maybe I'm dense but I thought Drill starts a ZK? Or do I have to install
>>> and launch ZK separately?
>>>
>>> I'm using the binary version of M1. Run all things local only on my
>>> laptop ...
>>>
>>> Cheers,
>>> Michael
>>>
>>> Sent from my iPad
>>>
>>> --
>>> Michael Hausenblas, http://mhausenblas.info
>>>
>>>> On 27 Oct 2013, at 22:17, Steven Phillips wrote:
>>>>
>>>> You need to replace localhost with the hostname of the node running
>>>> zookeeper. If that zookeeper is configured to use a port different than
>>>> 2181, then that needs to be set as well. If you have multiple zookeepers in
>>>> the quorum, then zk.connect should be a comma separated list of the
>>>> host:port of each node.
>>>>
>>>> The default, localhost setting will only work when a drillbit is running on
>>>> the same node as the zookeeper.
>>>>
>>>>
>>>> On Sun, Oct 27, 2013 at 2:57 PM, Michael Hausenblas <
>>>> michael.hausenblas@gmail.com> wrote:
>>>>
>>>>>
>>>>>> One thing to add to the diagram is that all of the drill java processes
>>>>>> will look at what is in drill-override.conf.
>>>>>
>>>>> Thanks, done.
>>>>>
>>>>>
>>>>>> You must set zk.connect to the correct zk host:port.
>>>>>
>>>>>
>>>>> Can you be a tad more explicit, please? In drill-override.conf I have
>>>>>
>>>>> [[
>>>>> …
>>>>> zk: {
>>>>> connect: "localhost:2181”,
>>>>> …
>>>>> ]]
>>>>>
>>>>>
>>>>> What am I overlooking?
>>>>>
>>>>> Also, any directions re the rest of my questions (re bin/submit_plan etc.)?
>>>>>
>>>>>
>>>>> With a little help from here, I’m happy to put together the description
>>>>> how to set this up in the Wiki, also to address a query we’ve now lying
>>>>> around for more than three weeks, by Steve McPherson – see
>>>>> http://mail-archives.apache.org/mod_mbox/incubator-drill-user/201310.mbox/%3CCE71A20F.14F5B%25stevemp%40amazon.com%3E
>>>>> – the fact that it attracted 0 responses I find slightly embarrassing, and
>>>>> if I were Steve, I’d prolly not touch Drill anymore, but let’s hope for the
>>>>> best …
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Michael
>>>>>
>>>>> --
>>>>> Michael Hausenblas
>>>>> Ireland, Europe
>>>>> http://mhausenblas.info/
>>>>>
>>>>>> On 27 Oct 2013, at 21:32, Steven Phillips wrote:
>>>>>>
>>>>>> One thing to add to the diagram is that all of the drill java processes
>>>>>> will look at what is in drill-override.conf. You must set zk.connect to the
>>>>>> correct zk host:port.
>>>>>>
>>>>>>
>>>>>> On Sun, Oct 27, 2013 at 2:00 PM, Michael Hausenblas <
>>>>>> michael.hausenblas@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> Folks,
>>>>>>>
>>>>>>> I’m trying to set up Drill in distributed mode.
>>>>>>> Here’s what I have so far:
>>>>>>> when I launch the first Drillbit with bin/drillbit.sh I get the following
>>>>>>> in log/drillbit.out:
>>>>>>>
>>>>>>> [[
>>>>>>> 20:47:20.963 [main] ERROR com.netflix.curator.ConnectionState - Connection timed out for connection string (localhost:2181) and timeout (5000) / elapsed (5045)
>>>>>>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
>>>>>>>     at com.netflix.curator.ConnectionState.getZooKeeper(ConnectionState.java:94) ~[curator-client-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:106) [curator-client-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:393) [curator-framework-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:184) [curator-framework-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:173) [curator-framework-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:85) [curator-client-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:169) [curator-framework-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:161) [curator-framework-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:36) [curator-framework-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.x.discovery.details.ServiceDiscoveryImpl.getChildrenWatched(ServiceDiscoveryImpl.java:306) [curator-x-discovery-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.x.discovery.details.ServiceDiscoveryImpl.queryForInstances(ServiceDiscoveryImpl.java:276) [curator-x-discovery-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.x.discovery.details.ServiceCache.refresh(ServiceCache.java:193) [curator-x-discovery-1.1.9.jar:na]
>>>>>>>     at com.netflix.curator.x.discovery.details.ServiceCache.start(ServiceCache.java:116) [curator-x-discovery-1.1.9.jar:na]
>>>>>>>     at org.apache.drill.exec.coord.ZKClusterCoordinator.start(ZKClusterCoordinator.java:89) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>>>>>>     at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:94) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>>>>>>     at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:56) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>>>>>>     at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:43) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>>>>>>     at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:65) [java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
>>>>>>> ]]
>>>>>>>
>>>>>>> This seems to be a known issue? See
>>>>>>> http://stackoverflow.com/questions/16056751/curator-zookeeper-client-keeps-throw-out-connectionlossexception-per-connection
>>>>>>>
>>>>>>> Any ideas? Did anyone actually run Drill in distributed mode already and
>>>>>>> if so, how did you overcome the above issue?
>>>>>>>
>>>>>>> What is next? How do I make other Drillbits point to the same ZK cluster?
>>>>>>> And has anyone an example of the call parameters for bin/submit_plan maybe
>>>>>>> as well?
>>>>>>>
>>>>>>>
>>>>>>> BTW, in the process of trying to figure what’s going on behind the scene I
>>>>>>> traced down the startup call dependencies (scripts), available via:
>>>>>>>
>>>>>>> https://docs.google.com/drawings/d/1-ADIGJ-lBr-dOrOjMpQlProiZjYjjuM0kR6A81BYwKA/edit?usp=sharing
>>>>>>>
>>>>>>> which we could then also use for documentation purposes.
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Michael
>>>>>>>
>>>>>>> --
>>>>>>> Michael Hausenblas
>>>>>>> Ireland, Europe
>>>>>>> http://mhausenblas.info/
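
Editor's note on the zk.connect advice quoted in this thread: the value is described as a comma separated list of host:port entries, with 2181 as the usual ZooKeeper port. The sketch below is only an illustration of that rule, not part of Drill. It splits such a string and tries a plain TCP connect to each entry, which can help tell a wrong host or port apart from other failures. The function names (parse_zk_connect, is_reachable, check_zk_connect) are made up for this example, falling back to 2181 when no port is given is my assumption, and IPv6 literals are not handled.

```python
import socket

# Assumption: fall back to the conventional ZooKeeper client port when an
# entry has no explicit ":port" suffix.
DEFAULT_ZK_PORT = 2181


def parse_zk_connect(connect):
    """Split a zk.connect style string into a list of (host, port) tuples."""
    endpoints = []
    for entry in connect.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty items such as a trailing comma
        host, sep, port = entry.rpartition(":")
        if sep:
            endpoints.append((host, int(port)))
        else:
            endpoints.append((entry, DEFAULT_ZK_PORT))
    return endpoints


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, DNS failures and
        # "Network is unreachable".
        return False


def check_zk_connect(connect, timeout=2.0):
    """Return (host, port, reachable) for every entry in the string."""
    return [(host, port, is_reachable(host, port, timeout))
            for host, port in parse_zk_connect(connect)]
```

Calling check_zk_connect("127.0.0.1:2181") on the machine that runs the Drillbits returns one tuple per entry. A False in the last position means the TCP connect itself failed, so the host or port in zk.connect is the first thing to look at. A True only shows that something is listening there, not that it is a healthy ZooKeeper.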