drill-user mailing list archives

From Ron Cecchini <roncecch...@comcast.net>
Subject Re: Problem running Drill in a Docker container in OpenShift
Date Wed, 29 Jan 2020 22:47:20 GMT
Sorry for the spam, but I think I figured it out.  

Thank you so much for your suggestions to build an image from an image.  I finally put 2 & 2
together, realized what you were saying, and created the following Dockerfile.  I then
built and pushed the image into OpenShift, and it started up nicely.  

I haven't had a chance to test it yet, but I'm optimistic.

Thank you again.

---

Dockerfile:

# Use the latest official release of Apache Drill
FROM apache/drill:1.17.0

# Make /opt/drill accessible to the root group (GID 0), since OpenShift runs the container as an arbitrary user ID
RUN chgrp -R 0 /opt/drill && chmod -R g=u /opt/drill

# Start Drill in embedded mode and connect to Sqlline
ENTRYPOINT /opt/drill/bin/drill-embedded
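
For anyone following along, here is a minimal sketch of what the `chmod -R g=u` step in the Dockerfile above does, run against a throwaway directory rather than a real Drill install. The directory layout and file contents are hypothetical stand-ins for /opt/drill, `chgrp -R 0` is left out because it normally needs root, and `stat -c` assumes GNU coreutils (Linux).

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for /opt/drill; nothing here touches a real Drill install.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/conf" "$demo_dir/bin"
printf 'drill.exec: {}\n' > "$demo_dir/conf/drill-override.conf"
printf '#!/bin/sh\necho started\n' > "$demo_dir/bin/drill-embedded"

# Start from owner-only permissions, so the group has no access at all.
chmod 600 "$demo_dir/conf/drill-override.conf"
chmod 700 "$demo_dir/bin/drill-embedded"
before_conf=$(stat -c '%a' "$demo_dir/conf/drill-override.conf")

# Same chmod as in the Dockerfile: copy the owner's permission bits to the group.
# (The Dockerfile also runs `chgrp -R 0` first; that is skipped here because it needs root.)
chmod -R g=u "$demo_dir"

after_conf=$(stat -c '%a' "$demo_dir/conf/drill-override.conf")
after_bin=$(stat -c '%a' "$demo_dir/bin/drill-embedded")

echo "drill-override.conf: $before_conf -> $after_conf"
echo "drill-embedded:      700 -> $after_bin"

rm -rf "$demo_dir"
```

In the image itself the same chmod runs after `chgrp -R 0`, so the root group ends up with the owner's access to everything under /opt/drill, which is what the OpenShift guidelines quoted below ask for.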

> On January 29, 2020 at 3:05 PM Ron Cecchini <roncecchini@comcast.net> wrote:
> 
> 
> Thank you, Paul and Volodymyr.  I'll answer all of your questions below.  (Warning: It gets a little long...)
> 
> So, first of all, I am behind a proxied firewall, and I am trying to do this build on my RHEL/CentOS 7 development machine, and then replicate the steps on another ("build") machine and deploy it into an OpenShift environment, both of which are also running on CentOS 7.
> 
> And we are looking to run Drill only in embedded mode.
> 
> > Please explain a bit more about the file permissions issue.
> > Is the file owned by a user other than the one that runs Drill?
> > If so, sounds like a bug, unless OpenShift uses a different user than plain Docker would.
> 
> Per the OpenShift docs below, OpenShift indeed runs the container as a different user than Docker would.
> 
> https://docs.openshift.com/container-platform/3.11/creating_images/guidelines.html#openshift-specific-guidelines
> 
>     Support Arbitrary User IDs
> 
>     By default, OpenShift Container Platform runs containers using an arbitrarily assigned user ID.
>     This provides additional security against processes escaping the container due to a container
>     engine vulnerability and thereby achieving escalated permissions on the host node.
> 
>     For an image to support running as an arbitrary user, directories and files that may be written
>     to by processes in the image should be owned by the root group and be read/writable by that group.
>     Files to be executed should also have group execute permissions.
> 
>     Adding the following to your Dockerfile sets the directory and file permissions to allow users
>     in the root group to access them in the built image:
> 
>     RUN chgrp -R 0 /some/directory && chmod -R g=u /some/directory
> 
> ...
> 
> OpenShift does provide a way to run containers as root -- but we're absolutely trying to avoid that.
> 
> So it was that suggestion to add the 'chgrp' and 'chmod' to the Dockerfile that seemed to be the best and easiest solution.
>  
> > I believe our standard image is for building Drill.
> > What you want is an image that uses an existing Drill build.
> > The "SNAPSHOT" refers to the current master version of the code, which is probably not what you want.
> > You want the released 1.17 binaries since you want to use, not develop, Drill.
> 
> Thank you for the suggestion.  You were right, I was sitting on the master branch.
> 
> So I checked out the 1.17.0 branch.  
> 
> Then I saw that there was no Dockerfile in this branch, so I copied the one I had been using (including the added chgrp/chmod), tried the 'docker build' -- and again got a similar build error regarding the parent pom.
> 
> But looking at the error, it seems to be more an issue with finding the specified Apache POM in the conjars.org/repo.  The following simply does not exist:
> 
>     http://conjars.org/repo/org/apache/apache/21/apache-21.pom   
> 
> (And on a related sidenote, I've built the apache/camel project, and its parent POM has an essentially identical '<parent><groupId>org.apache' section as yours - and it doesn't barf with a "parent POM" problem - so that can't be the problem.  It's obviously something to do with this connection to conjars.org.)
> 
> This is how I tried building the Drill Docker and the FATAL error I get:
> 
>     % git branch
> 
>     * (detached from origin/1.17.0)
>       master
> 
>     % docker build -t mydrill .
> 
>     Sending build context to Docker daemon  199.2MB
>     Step 1/10 : FROM maven:3.6-jdk-8 as build
>      ---> 5042e76d4104
>     Step 2/10 : COPY . /src
>      ---> 9a373ca8c131
>     Step 3/10 : WORKDIR /src
>      ---> Running in 982bd7f24911
>     Removing intermediate container 982bd7f24911
>      ---> 0b023e6084b1
>     Step 4/10 : RUN  mvn clean install -DskipTests -q
>      ---> Running in f2840c1bb274
>     [ERROR] [ERROR] Some problems were encountered while processing the POMs:
>     [FATAL] Non-resolvable parent POM for org.apache.drill:drill-root:1.17.0:
>     Could not transfer artifact org.apache:apache:pom:21 from/to conjars (http://conjars.org/repo):
>     Transfer failed for http://conjars.org/repo/org/apache/apache/21/apache-21.pom and
>     'parent.relativePath' points at no local POM @ line 24, column 11:
>     Connect to conjars.org:80 [conjars.org/184.73.255.175, conjars.org/54.225.137.155] failed:
>     Connection timed out (Connection timed out) -> [Help 2]
> 
> 
> > Short term, the best solution would be if you can build an image based on the existing Drill image rather than making a copy of the Dockerfile.
> 
> Since I'm essentially new to Docker as well, how would I build a new image -- using the 1.17.0 image I already have in my local Docker repo (instead of building from source) -- and also doing the chgrp/chmod I need to do?
> 
> (I did figure out how to use 'docker save apache/drill > apache-drill.tar' to pull apart a container, and inspected all the layers to find the apache-drill-1.17.0.tar.gz you're referring to.  But hopefully I don't need to go down *that* path of trying to build an image from scratch using your jars, and the 3rd party jars you include, etc, and hopefully there's a simpler way of building an image based on your image.)
> 
> Or maybe the solution to the conjars.org maven build error is trivial and I can try that again.
> 
> Thank you again so much!
> 
> Ron
> 
> > If we understand the original file permission problem, perhaps we can find a way to fix that.
> 
> > On January 29, 2020 at 2:10 AM Paul Rogers <par0328@yahoo.com.INVALID> wrote:
> > 
> > 
> > Hi Ron,
> > 
> > I don't think anyone on the Drill team has access to an OpenShift environment. Let's see if we can use your work to ensure that the Docker image supports OpenShift in the future.
> > 
> > Please explain a bit more about the file permissions issue. Is the file owned by a user other than the one that runs Drill? If so, sounds like a bug, unless OpenShift uses a different user than plain Docker would.
> > 
> > 
> > I believe our standard image is for building Drill. What you want is an image that uses an existing Drill build. The "SNAPSHOT" refers to the current master version of the code, which is probably not what you want. You want the released 1.17 binaries since you want to use, not develop, Drill.
> > 
> > Question for the team: do we have a separate image for folks who want to run the latest 1.17 release?
> > 
> > Short term, the best solution would be if you can build an image based on the existing Drill image rather than making a copy of the Dockerfile. If we understand the original file permission problem, perhaps we can find a way to fix that.
> > 
> > Are you looking to run Drill in embedded mode (Sqlline in a container, you ssh into the container; config lost on Drill shutdown) or in server mode (config stored in ZK so it persists across container runs)?
> > 
> > 
> > Thanks,
> > - Paul
> > 
> >  
> > 
> >     On Tuesday, January 28, 2020, 9:41:28 PM PST, Ron Cecchini <roncecchini@comcast.net> wrote:
> >  
> >  
> > Hi, all.  Drill and OpenShift newbie here.
> > 
> > Has anyone successfully deployed a Drill Docker container to an OpenShift environment?
> > 
> > While there is information about Drill Docker, there seems to be zero information about OpenShift in particular.
> > 
> > Per the instructions at drill.apache.org/docs/running-drill-on-docker, I pulled the Drill Docker image from Docker Hub, and then pushed it to our OpenShift environment.  But when I tried to deploy it, I immediately ran into an error about /opt/drill/conf/drill-override.conf not being readable.
> > 
> > I understand why the problem is happening (because of who OpenShift runs the container as), so I downloaded the source from GitHub and modified the Dockerfile to include:
> > 
> >     RUN chgrp -R 0 /opt/drill && chmod -R g=u /opt/drill
> > 
> > so that all of /opt/drill would be available to everyone.  But then 'docker build' kept failing, giving the error:
> > 
> >     Non-resolvable parent POM for org.apache.drill:drill-root:1.18.0-SNAPSHOT:
> >     Could not transfer artifact org.apache:apache:pom:21
> > 
> > I tried researching that error but couldn't figure out what was going on.  So I finally decided to start trying to mount persistent volumes, creating one PV for /opt/drill/conf (and then copying the default drill-override.conf there) and one PV for /opt/drill/log.
> > 
> > Now the container gets much further, but eventually fails on something Hadoop related.  I'm not trying to do anything with Hadoop, so I don't know what that's about, but it says I don't have HADOOP_HOME set.
> > 
> > Hopefully I can figure out the remaining steps I need (an environment variable?  more configs?), but I was wondering if anyone else had already successfully figured out how to deploy to OpenShift, or might know why the 'docker build' fails with that error?
> > 
> > For what it's worth, I copied over only that drill-override.conf and nothing else.  And I did not set any Drill environment variables in OpenShift.  I'm basically trying to run the "vanilla" Drill Docker as-is.
> > 
> > Thanks for any help!
> > 
> > Ron
> >
