drill-user mailing list archives

From Ron Cecchini <roncecch...@comcast.net>
Subject Re: Problem running Drill in a Docker container in OpenShift
Date Fri, 31 Jan 2020 21:40:27 GMT
Thank you, Paul, for your in-depth and informative response.

Here's what we did today as a test:

In OpenShift, we allowed containers to run as whatever user they want (including root) and
redeployed the Drill Docker image.

Note:
- this is the 1.17.0 image we pulled directly from Docker Hub, not the image I made from that
image
- we are using 2 persistent volumes (PV) for /opt/drill/conf and /opt/drill/log
- I copied everything that was in the source code's (from GitHub) 'resources' directory into
the PV mapped to  /opt/drill/conf.  (FWIW, there are more files in that 'resources' directory
than in the container's original /opt/drill/conf.)

The result is that the container comes up - shows that the user.name is indeed root - doesn't
give any errors ...

And then it seemingly goes away, again without any indication of an error.  OpenShift then goes
into its crash-restart cycle, trying to restart the container, and each time we see the same
thing:

The console in OpenShift displays:

  Apache Drill 1.17.0
  "You won't meet Santa, but Drill supports Clauses."

which obviously looks good.

But the sqlline.log, while not indicating anything bad, seems to go into a graceful shutdown
immediately after connecting.  I turned up logging on sqlline to DEBUG and will include the
tail end of the log below.

The only other clue I have is that if I go into the OpenShift debug shell for the Drill container,
and execute /opt/drill/bin/drill-embedded, then Drill comes up and stays up (only while that
debug shell is up).
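
One thing that may be worth checking here, and this is only an assumption that nothing in
the thread confirms: sqlline is an interactive shell and exits when it reads end-of-file on
standard input, which is what it would see if the container is started without stdin or a
terminal attached. The debug shell does have a terminal, which would be consistent with
drill-embedded staying up there. A minimal sketch of a check that could be placed at the top
of a wrapper entrypoint; the function name stdin_kind is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of Drill): report whether standard input
# is attached to a terminal.
stdin_kind() {
  if [ -t 0 ]; then
    echo "tty"
  else
    echo "not-a-tty"
  fi
}

# Print the result so that it shows up in the container log.
stdin_kind
```

If this prints not-a-tty in the pod log, setting stdin: true and tty: true on the container
in the pod spec might be enough to keep sqlline alive. Those are standard Kubernetes container
fields, but whether they resolve this particular case is untested here.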

In the meantime, I think we may be looking to install Drill from the tar.gz and run it on
a dedicated server outside of OpenShift.

Thank you again.

-----

[...]
2020-01-31 21:32:54,030 [main] DEBUG o.a.d.e.w.f.r.ResourceManagerBuilder - No query queueing
enabled.
2020-01-31 21:32:54,032 [main] INFO  o.apache.drill.exec.server.Drillbit - Startup completed
(5787 ms).
2020-01-31 21:32:54,034 [main] WARN  o.a.drill.exec.metrics.DrillMetrics - Removing old metric
since name matched newly registered metric. Metric name: drill.allocator.root.used
2020-01-31 21:32:54,034 [main] WARN  o.a.drill.exec.metrics.DrillMetrics - Removing old metric
since name matched newly registered metric. Metric name: drill.allocator.root.peak
2020-01-31 21:32:54,048 [main] DEBUG org.apache.drill.exec.ssl.SSLConfig - Initialized SSL
context.
2020-01-31 21:32:54,048 [main] DEBUG o.a.drill.exec.client.DrillClient - Connecting to server
XXXXXXX-16-6xqlt:31010
2020-01-31 21:32:54,140 [UserServer-1] DEBUG o.a.drill.exec.memory.BoundsChecking - Direct
memory bounds checking is disabled.
2020-01-31 21:32:54,152 [Client-1] DEBUG o.a.d.e.rpc.ConnectionMultiListener - Handshake completed
successfully.
2020-01-31 21:32:54,153 [main] INFO  o.a.drill.exec.client.DrillClient - Foreman drillbit
is XXXXXXX-16-6xqlt
2020-01-31 21:32:54,153 [main] INFO  o.a.drill.exec.client.DrillClient - Successfully connected
to server XXXXXXX-16-6xqlt:31010
2020-01-31 21:32:54,382 [main] DEBUG o.apache.drill.exec.rpc.BasicClient - Closing client
2020-01-31 21:32:54,385 [main] DEBUG o.apache.drill.exec.server.Drillbit - Shutdown begun.
2020-01-31 21:32:55,416 [main] INFO  o.a.drill.exec.compile.CodeCompiler - Stats: code gen
count: 0, cache miss count: 0, hit rate: 0%
2020-01-31 21:32:55,428 [main] INFO  o.apache.drill.exec.server.Drillbit - Shutdown completed
(1042 ms).
2020-01-31 21:32:55,432 [Drillbit-Graceful-Shutdown#Thread-6] DEBUG o.apache.drill.exec.server.Drillbit
- Graceful Shutdown thread was interrupted
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
	at java.util.concurrent.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:492)
	at java.util.concurrent.LinkedBlockingDeque.take(LinkedBlockingDeque.java:680)
	at sun.nio.fs.AbstractWatchService.take(AbstractWatchService.java:118)
	at org.apache.drill.exec.server.Drillbit$GracefulShutdownThread.pollShutdown(Drillbit.java:427)
	at org.apache.drill.exec.server.Drillbit$GracefulShutdownThread.run(Drillbit.java:401)


> On January 31, 2020 at 2:06 AM Paul Rogers <par0328@yahoo.com.INVALID> wrote:
> 
> 
> Hi Ron,
> 
> Actually, helping us track down this issue is a great contribution in itself.
> 
> My Docker is a bit rusty, but I just did a quick check of Drill's two Dockerfiles. Neither
seems designed for a production deployment: one is a build image, the other a simple embedded
image. Neither sets a user.
> 
> I suspect we need to create an image for this use case that:
> 
> 1. Installs dependencies (JDK, etc.)
> 2. Creates a "drill" user.
> 3. Downloads and unpacks the target Drill distribution.
> 4. Changes owner of the distribution to user "drill".
> 5. Runs Drill as user "drill".
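
The five steps above could be sketched roughly as the shell commands a Dockerfile RUN step
would execute. This is an illustration under several assumptions, not the project's
Dockerfile: the base image already provides a JDK (step 1); curl, GNU tar, and useradd are
available; the download URL follows the Apache archive layout; and the function names and
the --install flag are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes a JDK is already present in the base image (step 1).
DRILL_VERSION="${DRILL_VERSION:-1.17.0}"
DRILL_HOME="${DRILL_HOME:-/opt/drill}"

# Build the download URL for a given Drill version
# (assumed to follow the Apache archive layout).
drill_tarball_url() {
  echo "https://archive.apache.org/dist/drill/drill-$1/apache-drill-$1.tar.gz"
}

install_drill() {
  # Step 2: create a dedicated "drill" user.
  useradd --system --create-home drill
  # Step 3: download and unpack the target Drill distribution.
  mkdir -p "$DRILL_HOME"
  curl -fsSL "$(drill_tarball_url "$DRILL_VERSION")" \
    | tar -xz -C "$DRILL_HOME" --strip-components=1
  # Step 4: hand the distribution over to user "drill".
  chown -R drill "$DRILL_HOME"
}

# Only perform the installation when explicitly requested.
if [ "${1:-}" = "--install" ]; then
  install_drill
fi
```

Step 5 would then be a USER drill line in the Dockerfile, followed by an entrypoint such as
/opt/drill/bin/drill-embedded.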
> 
> For historical reasons, lots of Drill QA and developers deployed Drill as root when running
on VMs; it looks like the Docker image may have followed that pattern. But, as you noted, running
anything as root, even in a container, is usually frowned upon.
> 
> I did a quick check of some other community Dockerfiles. As expected, they mostly follow
the same steps as above (customized, of course, for the particular project.)
> 
> 
> I think you said you are new to Docker. You can try to create an image that does the
above; it's not hard (just lots of details to learn). Otherwise, I may have some time to take
a crack at it in a few days. Or, perhaps one of the other Drill devs can try it sooner.
> 
> 
> Your goal is to run Drill as part of your app. To do that, we'll want the option to host-mount
certain directories. For example, it might be helpful to write query profiles and logs to
a host directory so that these files survive a container failure.
> 
> OpenShift is a wrapper around Kubernetes. So, our Drill image should be designed to run
under K8s. For example, we should use K8s config maps to pass in things like the Drill config
file (so you can run Drill distributed, pointed to a ZK), and use K8s to set things like
the Drill memory options. (Otherwise, you'd have to rebuild the container to grab a new drill-override.conf
or drill-env.sh file any time you want to change a setting. That's OK for playing around, but not
scalable in production.)
> 
> Your data will be stored somewhere. Ceph? Using its S3 look-alike API? If so, you'll
need to pass that config into your container and define the required storage plugin. You can
do that statically each time, or you can run ZK so that storage plugins and other config survive
from one container run to the next.
> 
> 
> I can take a crack at adding the needed Dockerfile features, and the K8s setup, once
we get the basics to work. Or, if another Drill dev wants to try, start with a one-node K8s
on your PC or test machine.
> 
> In the meantime, one of the nice things about Drill is you can actually try things out
on your own laptop or VM. Grab a few Parquet files of the type you plan to use. Or, configure
Drill to read from your distributed file system. Run some test queries. Get the hang of how
Drill works so you are comfortable with using Drill once we get your containers to run.
> 
> 
> Thanks,
> - Paul
> 
>  
> 
>     On Thursday, January 30, 2020, 10:14:08 PM PST, Ron Cecchini <roncecchini@comcast.net>
wrote:  
>  
>  So, apparently I spoke too soon, as my image-from-an-image in OpenShift actually *didn't*
start up successfully.
> 
> I'll take Ted's and Charles' comments in the current "[DISCUSS]" thread regarding attracting
users who may never contribute a line of code but nevertheless report on their use of Drill
and problems they may encounter as license to ask a follow up question...  Besides, with
the growing usage of OpenShift, I'm sure I won't be the last one trying to do this!  
> 
> You guys were talking about attracting enthusiastic users who want to spread the word
to their friends.  Well, as a bit of a backstory, it was an enthusiastic analyst here who
floated the idea of replacing our MongoDB + JSON with Drill + Parquet.  Eventually our tech
lead signed on to the idea and I got tasked to help out.  If this all works out, I'm sure
word will spread to other analysts.  So here I am...  (I don't mind being the "canary in
the coalmine", so to speak.  I've often had to take on that role.  And I get to learn new
things...)
> 
> Ok, enough blah-blah-blah.  The problem I'm having, as far as I can tell, stems from
the fact that OpenShift doesn't set a user name when running a container.  In particular,
the Java System.getProperty("user.name") and "user.home" calls return "?" as reported in the
Zookeeper (I'm in embedded mode but the Zookeeper config is still being filled in) section
of the sqlline.log:
> 
>  [main] INFO  o.apache.drill.exec.server.Drillbit - Drillbit environment: user.name=?
>  [main] INFO  o.apache.drill.exec.server.Drillbit - Drillbit environment: user.home=?
> 
> As a quick point of comparison, when I run a Drill Docker on my desktop (not in OpenShift),
with zero config changes, everything of course works fine, and "user.name" is "root" and "user.home"
is "/root".  (Probably because I installed Docker with a "sudo yum install".)  Similarly,
when I run an embedded-Drill on my desktop installed from the source *tar.gz - i.e. not in
a Docker, and not in OpenShift - "user.name" is my $USER and "user.home" is my $HOME, and
everything again runs fine.
> 
> I'll include the big stack trace at the bottom, but really the only question I have at
the moment is:
> 
> Given that I don't think I can get OpenShift to set or pass in a "user.name" property
for the JDK to get, is there an environment variable (OpenShift *can* pass those in) or a
magical setting in drill-override.conf that will let me get past this "invalid null input:
name" problem I'm running into during the "login" phase of the startup?
> 
> And if so, could the "user.name" be set to anything or would it have to be root?  (I
don't quite understand the relationship between the "user.name" and the Hadoop login, etc.)
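
A workaround often used for images that must run under OpenShift's arbitrary UIDs is to add
a passwd entry for the current UID at container start, so that the JDK can resolve a user
name. The sketch below assumes the image made /etc/passwd group-writable at build time (for
example with chmod g=u /etc/passwd). The names passwd_line, ensure_passwd_entry, and
USER_NAME are illustrative, not anything Drill or OpenShift defines, and whether this alone
gets Drill past the "invalid null input: name" error has not been verified in this thread.

```shell
#!/bin/sh
# Build a passwd(5) line for the given name ($1), UID ($2) and home ($3),
# using GID 0 because OpenShift runs arbitrary UIDs in the root group.
passwd_line() {
  printf '%s:x:%s:0:%s user:%s:/sbin/nologin\n' "$1" "$2" "$1" "$3"
}

# Append an entry for the current UID to the passwd file ($1, default
# /etc/passwd) if that UID has no entry yet and the file is writable.
ensure_passwd_entry() {
  passwd_file="${1:-/etc/passwd}"
  uid="$(id -u)"
  if ! grep -q "^[^:]*:[^:]*:${uid}:" "$passwd_file" && [ -w "$passwd_file" ]; then
    # USER_NAME is a hypothetical variable that OpenShift could pass in.
    passwd_line "${USER_NAME:-drill}" "$uid" "${HOME:-/tmp}" >> "$passwd_file"
  fi
}
```

In a wrapper entrypoint this would be called as ensure_passwd_entry (with no argument)
before starting /opt/drill/bin/drill-embedded. The optional file argument exists only so
that the function can be tried against a scratch file.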
> 
> Thank you so much for any help!
> 
> Ron
> 
> --------------------------------------------------------------------------------
> 
> Error: Failure in starting embedded Drillbit: org.apache.drill.exec.exception.DrillbitStartupException:
Failed to login. (state=,code=0)
> java.sql.SQLException: Failure in starting embedded Drillbit: org.apache.drill.exec.exception.DrillbitStartupException:
Failed to login.
>     at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:143)
>     at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
>     at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
>     at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
>     at org.apache.drill.jdbc.Driver.connect(Driver.java:75)
>     at sqlline.DatabaseConnection.connect(DatabaseConnection.java:135)
>     at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:192)
>     at sqlline.Commands.connect(Commands.java:1364)
>     at sqlline.Commands.connect(Commands.java:1244)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
>     at sqlline.SqlLine.dispatch(SqlLine.java:730)
>     at sqlline.SqlLine.initArgs(SqlLine.java:410)
>     at sqlline.SqlLine.begin(SqlLine.java:515)
>     at sqlline.SqlLine.start(SqlLine.java:267)
>     at sqlline.SqlLine.main(SqlLine.java:206)
> Caused by: org.apache.drill.exec.exception.DrillbitStartupException: Failed to login.
>     at org.apache.drill.exec.server.BootStrapContext.login(BootStrapContext.java:161)
>     at org.apache.drill.exec.server.BootStrapContext.<init>(BootStrapContext.java:82)
>     at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:171)
>     at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:135)
>     at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:133)
>     ... 18 more
> Caused by: org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException:
java.lang.NullPointerException: invalid null input: name
>     at com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
>     at com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:133)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
>     at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
>     at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
>     at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
>     at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
>     at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:1926)
>     at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1837)
>     at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:710)
>     at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:660)
>     at org.apache.drill.exec.server.BootStrapContext.login(BootStrapContext.java:156)
>     at org.apache.drill.exec.server.BootStrapContext.<init>(BootStrapContext.java:82)
>     at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:171)
>     at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:135)
>     at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:133)
>     at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
>     at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
>     at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
>     at org.apache.drill.jdbc.Driver.connect(Driver.java:75)
>     at sqlline.DatabaseConnection.connect(DatabaseConnection.java:135)
>     at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:192)
>     at sqlline.Commands.connect(Commands.java:1364)
>     at sqlline.Commands.connect(Commands.java:1244)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
>     at sqlline.SqlLine.dispatch(SqlLine.java:730)
>     at sqlline.SqlLine.initArgs(SqlLine.java:410)
>     at sqlline.SqlLine.begin(SqlLine.java:515)
>     at sqlline.SqlLine.start(SqlLine.java:267)
>     at sqlline.SqlLine.main(SqlLine.java:206)
> 
>     at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1847)
>     at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:710)
>     at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:660)
>     at org.apache.drill.exec.server.BootStrapContext.login(BootStrapContext.java:156)
>     ... 22 more
> Caused by: javax.security.auth.login.LoginException: java.lang.NullPointerException:
invalid null input: name
>     at com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
>     at com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:133)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
>     at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
>     at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
>     at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
>     at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
>     at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:1926)
>     at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1837)
>     at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:710)
>     at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:660)
>     at org.apache.drill.exec.server.BootStrapContext.login(BootStrapContext.java:156)
>     at org.apache.drill.exec.server.BootStrapContext.<init>(BootStrapContext.java:82)
>     at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:171)
>     at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:135)
>     at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:133)
>     at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
>     at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
>     at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
>     at org.apache.drill.jdbc.Driver.connect(Driver.java:75)
>     at sqlline.DatabaseConnection.connect(DatabaseConnection.java:135)
>     at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:192)
>     at sqlline.Commands.connect(Commands.java:1364)
>     at sqlline.Commands.connect(Commands.java:1244)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
>     at sqlline.SqlLine.dispatch(SqlLine.java:730)
>     at sqlline.SqlLine.initArgs(SqlLine.java:410)
>     at sqlline.SqlLine.begin(SqlLine.java:515)
>     at sqlline.SqlLine.start(SqlLine.java:267)
>     at sqlline.SqlLine.main(SqlLine.java:206)
> 
>     at javax.security.auth.login.LoginContext.invoke(LoginContext.java:856)
>     at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
>     at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
>     at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
>     at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
>     at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:1926)
>     at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1837)
>     ... 25 more
> Apache Drill 1.17.0
> "You told me to, Drill Sergeant!"
> 
> > On January 29, 2020 at 5:47 PM Ron Cecchini <roncecchini@comcast.net> wrote:
> > 
> > Sorry for the spam, but I think I figured it out.  
> > 
> > Thank you so much for your suggestions to build an image from an image.  I finally
put 2 & 2 together and realized what you were saying and created the following Dockerfile. 
I then built and pushed the image into OpenShift - and it started up nicely.  
> > 
> > I haven't had a chance to test it yet, but I'm optimistic.
> > 
> > Thank you again.
> > 
> > ---
> > 
> > Dockerfile:
> > 
> > # Use the latest official release of Apache Drill
> > FROM apache/drill:1.17.0
> > 
> > # Give group 0 the same permissions as the owner on /opt/drill, because
> > # OpenShift runs the container as an arbitrary UID in the root group
> > RUN chgrp -R 0 /opt/drill && chmod -R g=u /opt/drill
> > 
> > # Start Drill in embedded mode and connect to SqlLine
> > ENTRYPOINT /opt/drill/bin/drill-embedded
>
