drill-user mailing list archives

From Paul Rogers <par0...@yahoo.com.INVALID>
Subject Re: Problem running Drill in a Docker container in OpenShift
Date Fri, 31 Jan 2020 07:06:33 GMT
Hi Ron,

Actually, helping us track down this issue is a great contribution in itself.

My Docker is a bit rusty, but I just did a quick check of Drill's two Dockerfiles. Neither
seems designed for a production deployment: one is a build image, the other a simple embedded
image. Neither sets a user.

I suspect we need to create an image for this use case that:

1. Installs dependencies (JDK, etc.)
2. Creates a "drill" user.
3. Downloads and unpacks the target Drill distribution.
4. Changes owner of the distribution to user "drill".
5. Runs Drill as user "drill".
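
A minimal sketch of those five steps, not a tested production image. The base image, the
download URL layout, the UID/GID of 1000, and the /opt/drill install path are all assumptions
chosen for illustration; it also assumes curl and tar are available in the base image.

```dockerfile
# Sketch only: base image, version, URL, UID, and paths are assumptions.

# 1. Dependencies: start from an image that already ships a JDK.
FROM openjdk:8-jdk

ARG DRILL_VERSION=1.17.0
ENV DRILL_HOME=/opt/drill

# 2. Create an unprivileged "drill" user (UID/GID 1000 is an arbitrary choice).
RUN groupadd --gid 1000 drill \
 && useradd --uid 1000 --gid drill --create-home drill

# 3. Download and unpack the target Drill distribution.
#    The archive URL layout is an assumption; verify it before relying on it.
RUN mkdir -p "$DRILL_HOME" \
 && curl -fsSL "https://archive.apache.org/dist/drill/drill-${DRILL_VERSION}/apache-drill-${DRILL_VERSION}.tar.gz" \
  | tar -xz --strip-components=1 -C "$DRILL_HOME"

# 4. Change the owner of the distribution to the "drill" user.
RUN chown -R drill:drill "$DRILL_HOME"

# 5. Run Drill as "drill" rather than root.
USER drill
WORKDIR /opt/drill
ENTRYPOINT ["/opt/drill/bin/drill-embedded"]
```

The entrypoint mirrors the embedded launcher used elsewhere in this thread; a distributed
deployment would start the Drillbit differently.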

For historical reasons, many Drill QA engineers and developers deployed Drill as root when running
on VMs, and it looks like the Docker image may have followed that pattern. But, as you noted, running
anything as root, even in a container, is usually frowned upon.

I did a quick check of some other community Dockerfiles. As expected, they mostly follow the
same steps as above (customized, of course, for the particular project).


I think you said you are new to Docker. You can try to create an image that does the above;
it's not hard (just lots of details to learn). Otherwise, I may have some time to take a crack
at it in a few days. Or, perhaps one of the other Drill devs can try it sooner.


Your goal is to run Drill as part of your app. To do that, we'll want the option to host-mount
certain directories. For example, it might be helpful to write query profiles and logs to
a host directory so that these files survive a container failure.
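
As a rough illustration of the host-mount idea, the image could declare a dedicated log
directory as a mount point. This is a sketch under assumptions: it builds on the
apache/drill:1.17.0 image from this thread, picks /var/log/drill as an arbitrary path, and
relies on the DRILL_LOG_DIR environment variable read by Drill's launch scripts. The
query-profile location is governed by Drill's own configuration and is not shown here.

```dockerfile
# Sketch only: the log path is an arbitrary choice, not a Drill requirement.
FROM apache/drill:1.17.0

# Point Drill's launch scripts at a dedicated log directory.
ENV DRILL_LOG_DIR=/var/log/drill

# Create the directory before declaring the volume, and give group 0 the same
# access as the owner (the same chgrp/chmod pattern used for /opt/drill in this thread).
RUN mkdir -p /var/log/drill \
 && chgrp -R 0 /var/log/drill \
 && chmod -R g=u /var/log/drill

# Mark the directory as a mount point so its contents can live outside the container.
VOLUME ["/var/log/drill"]
```

At run time, a host directory (or a K8s volume) would then be mounted over /var/log/drill.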

OpenShift is a wrapper around Kubernetes, so our Drill image should be designed to run under
K8s. For example, we should use K8s config maps to pass in things like the Drill config file
(so you can run Drill distributed, pointed to a ZK), and use K8s to set things like the
Drill memory options. (Otherwise, you'd have to rebuild the container to grab a new drill-override.conf
or drill-env.sh file any time you want to change a setting. That's OK for playing around, but not
scalable in production.)

Your data will be stored somewhere. Ceph? Using its S3 look-alike API? If so, you'll need
to pass that config into your container and define the required storage plugin. You can do
that statically each time, or you can run ZK so that storage plugins and other config survive
from one container run to the next.


I can take a crack at adding the needed Dockerfile features, and the K8s setup, once we get
the basics to work. Or, if another Drill dev wants to try, start with a one-node K8s on your
PC or test machine.

In the meantime, one of the nice things about Drill is you can actually try things out on
your own laptop or VM. Grab a few Parquet files of the type you plan to use. Or, configure
Drill to read from your distributed file system. Run some test queries. Get the hang of how
Drill works so you are comfortable with using Drill once we get your containers to run.


Thanks,
- Paul

 

    On Thursday, January 30, 2020, 10:14:08 PM PST, Ron Cecchini <roncecchini@comcast.net>
wrote:  
 
 So, apparently I spoke too soon, as my image-from-an-image in OpenShift actually *didn't*
start up successfully.

I'll take Ted's and Charles' comments in the current "[DISCUSS]" thread regarding attracting
users who may never contribute a line of code but nevertheless report on their use of Drill
and problems they may encounter as license to ask a follow-up question...  Besides, with
the growing usage of OpenShift, I'm sure I won't be the last one trying to do this!  

You guys were talking about attracting enthusiastic users who want to spread the word to their
friends.  Well, as a bit of a backstory, it was an enthusiastic analyst here who floated
the idea of replacing our MongoDB + JSON with Drill + Parquet.  Eventually our tech lead
signed on to the idea and I got tasked to help out.  If this all works out, I'm sure word
will spread to other analysts.  So here I am...  (I don't mind being the "canary in the
coalmine", so to speak.  I've often had to take on that role.  And I get to learn new things...)

Ok, enough blah-blah-blah.  The problem I'm having, as far as I can tell, stems from the
fact that OpenShift doesn't set a user name when running a container.  In particular, the
Java System.getProperty("user.name") and "user.home" calls return "?" as reported in the Zookeeper
(I'm in embedded mode but the Zookeeper config is still being filled in) section of the sqlline.log:

 [main] INFO  o.apache.drill.exec.server.Drillbit - Drillbit environment: user.name=?
 [main] INFO  o.apache.drill.exec.server.Drillbit - Drillbit environment: user.home=?

As a quick point of comparison, when I run a Drill Docker on my desktop (not in OpenShift),
with zero config changes, everything of course works fine, and "user.name" is "root" and "user.home"
is "/root".  (Probably because I installed Docker with a "sudo yum install".)  Similarly,
when I run an embedded-Drill on my desktop installed from the source *tar.gz - i.e. not in
a Docker, and not in OpenShift - "user.name" is my $USER and "user.home" is my $HOME, and
everything again runs fine.

I'll include the big stack trace at the bottom, but really the only question I have at the
moment is:

Given that I don't think I can get OpenShift to set or pass in a "user.name" property for
the JDK to get, is there an environment variable (OpenShift *can* pass those in) or a magical
setting in drill-override.conf that will let me get past this "invalid null input: name" problem
I'm running into during the "login" phase of the startup?

And if so, could the "user.name" be set to anything or would it have to be root?  (I don't
quite understand the relationship between the "user.name" and the Hadoop login, etc.)

Thank you so much for any help!

Ron

--------------------------------------------------------------------------------

Error: Failure in starting embedded Drillbit: org.apache.drill.exec.exception.DrillbitStartupException: Failed to login. (state=,code=0)
java.sql.SQLException: Failure in starting embedded Drillbit: org.apache.drill.exec.exception.DrillbitStartupException: Failed to login.
    at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:143)
    at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
    at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
    at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
    at org.apache.drill.jdbc.Driver.connect(Driver.java:75)
    at sqlline.DatabaseConnection.connect(DatabaseConnection.java:135)
    at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:192)
    at sqlline.Commands.connect(Commands.java:1364)
    at sqlline.Commands.connect(Commands.java:1244)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
    at sqlline.SqlLine.dispatch(SqlLine.java:730)
    at sqlline.SqlLine.initArgs(SqlLine.java:410)
    at sqlline.SqlLine.begin(SqlLine.java:515)
    at sqlline.SqlLine.start(SqlLine.java:267)
    at sqlline.SqlLine.main(SqlLine.java:206)
Caused by: org.apache.drill.exec.exception.DrillbitStartupException: Failed to login.
    at org.apache.drill.exec.server.BootStrapContext.login(BootStrapContext.java:161)
    at org.apache.drill.exec.server.BootStrapContext.<init>(BootStrapContext.java:82)
    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:171)
    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:135)
    at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:133)
    ... 18 more
Caused by: org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
    at com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
    at com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:133)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:1926)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1837)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:710)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:660)
    at org.apache.drill.exec.server.BootStrapContext.login(BootStrapContext.java:156)
    at org.apache.drill.exec.server.BootStrapContext.<init>(BootStrapContext.java:82)
    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:171)
    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:135)
    at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:133)
    at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
    at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
    at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
    at org.apache.drill.jdbc.Driver.connect(Driver.java:75)
    at sqlline.DatabaseConnection.connect(DatabaseConnection.java:135)
    at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:192)
    at sqlline.Commands.connect(Commands.java:1364)
    at sqlline.Commands.connect(Commands.java:1244)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
    at sqlline.SqlLine.dispatch(SqlLine.java:730)
    at sqlline.SqlLine.initArgs(SqlLine.java:410)
    at sqlline.SqlLine.begin(SqlLine.java:515)
    at sqlline.SqlLine.start(SqlLine.java:267)
    at sqlline.SqlLine.main(SqlLine.java:206)

    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1847)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:710)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:660)
    at org.apache.drill.exec.server.BootStrapContext.login(BootStrapContext.java:156)
    ... 22 more
Caused by: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
    at com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
    at com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:133)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:1926)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1837)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:710)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:660)
    at org.apache.drill.exec.server.BootStrapContext.login(BootStrapContext.java:156)
    at org.apache.drill.exec.server.BootStrapContext.<init>(BootStrapContext.java:82)
    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:171)
    at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:135)
    at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:133)
    at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
    at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
    at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
    at org.apache.drill.jdbc.Driver.connect(Driver.java:75)
    at sqlline.DatabaseConnection.connect(DatabaseConnection.java:135)
    at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:192)
    at sqlline.Commands.connect(Commands.java:1364)
    at sqlline.Commands.connect(Commands.java:1244)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
    at sqlline.SqlLine.dispatch(SqlLine.java:730)
    at sqlline.SqlLine.initArgs(SqlLine.java:410)
    at sqlline.SqlLine.begin(SqlLine.java:515)
    at sqlline.SqlLine.start(SqlLine.java:267)
    at sqlline.SqlLine.main(SqlLine.java:206)

    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:856)
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:1926)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1837)
    ... 25 more
Apache Drill 1.17.0
"You told me to, Drill Sergeant!"

> On January 29, 2020 at 5:47 PM Ron Cecchini <roncecchini@comcast.net> wrote:
> 
> Sorry for the spam, but I think I figured it out.  
> 
> Thank you so much for your suggestions to build an image from an image.  I finally put
> 2 & 2 together and realized what you were saying and created the following Dockerfile.
> I then built and pushed the image into OpenShift - and it started up nicely.
> 
> I haven't had a chance to test it yet, but I'm optimistic.
> 
> Thank you again.
> 
> ---
> 
> Dockerfile:
> 
> # Use the latest official release of Apache Drill
> FROM apache/drill:1.17.0
> 
> # Give group 0 the same permissions on /opt/drill as the owner (needed under OpenShift's security policy)
> RUN chgrp -R 0 /opt/drill && chmod -R g=u /opt/drill
> 
> # Start Drill in embedded mode and connect to SQLLine
> ENTRYPOINT /opt/drill/bin/drill-embedded
  