hadoop-yarn-dev mailing list archives

From "Elek, Marton" <e...@apache.org>
Subject Re: [DISCUSS] Docker build process
Date Fri, 22 Mar 2019 09:06:14 GMT

Thanks for the answer,

I agree, sha256-based tags seem to be safer, as does bumping versions
only after some tests.
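
To make the "pin by digest, bump only after tests" idea concrete, here is
a minimal sketch in Python. It is not part of the Hadoop build: the helper
names (is_digest_pinned, bump_base_image) are hypothetical, and it assumes
a simple Dockerfile whose FROM line has no extra flags or build stages.

```python
import re

# A sha256 digest is 64 lowercase hex characters after "sha256:".
DIGEST_RE = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """Return True if the image reference is pinned by a sha256 digest."""
    return DIGEST_RE.match(image_ref) is not None


def bump_base_image(dockerfile: str, new_ref: str) -> str:
    """Replace the image in the first FROM line with new_ref.

    Hypothetical helper: it refuses references that are not pinned by
    digest, so a mutable tag such as centos:7 cannot slip into a release
    Dockerfile. Assumes a plain "FROM <image>" line without flags.
    """
    if not is_digest_pinned(new_ref):
        raise ValueError(f"not pinned by digest: {new_ref}")
    lines = dockerfile.splitlines()
    for i, line in enumerate(lines):
        parts = line.split()
        if len(parts) >= 2 and parts[0].upper() == "FROM":
            parts[1] = new_ref
            lines[i] = " ".join(parts)
            result = "\n".join(lines)
            # Keep the trailing newline if the input had one.
            return result + "\n" if dockerfile.endswith("\n") else result
    raise ValueError("no FROM line found")
```

The bump itself would only be committed after the tests have passed
against the new digest; the check above just keeps floating tags out.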

Let's say we have multiple hadoop docker images:


If I understood correctly, your proposal is the following:

In case of any security issue in centos/jdk, or in case of any bug in
the apache/hadoop-runner base image (we have a few shell/python scripts

1) We need to wait until the next release to fix them (3.2.1), which
means all the previous images would be insecure / bad forever (but still


2) in case of a serious problem a new release can be created from all
the lines (3.2.1, 3.1.3, 2.9.3, 2.8.6) with the help of all the release
managers. (old images remain the same).

But on the other hand, image creation would be as easy as activating
a new profile during the release. (In contrast: using a separate repo, a
new branch would be created and the version in the Dockerfile would be


PS: for development (non-published images) I am convinced that the
optional docker profile can be an easier way to create images. I will
create a similar plugin execution for this Dockerfile:


On 3/21/19 11:33 PM, Eric Yang wrote:
> The flexibility of a date-appended release number is equivalent to the
> Maven snapshot or Docker latest image convention; a machine can apply
> timestamps better than a human.  By using the Jenkins release process,
> this can be done with little effort.  For an official release, it is
> best to use the Docker image digest id to ensure uniqueness.  E.g.
> FROM centos@sha256:67dad89757a55bfdfabec8abd0e22f8c7c12a1856514726470228063ed86593b
> A developer who downloads the released source would build with the same
> Docker image, without side effects.
> A couple of years ago, RedHat decided to fix an SSL vulnerability in
> RedHat 6/7 by adding an extra parameter to disable certificate
> validation in the urllib2 Python library and forcing certificate signer
> validation on by default.  It completely broke the Ambari agent and its
> self-signed certificate.  Customers had to backtrack to pick up a
> specific version of the Python SSL library to keep their production
> clusters operational.  Without the due diligence of certifying the
> Hadoop code and the OS image together, there is wriggle room for errors.
> The OS update is a perfect example of why we want the container OS image
> certified with the Hadoop binary release, to avoid that wriggle room.
> A snapshot release can have wriggle room for developers, but I don't
> think that flexibility is necessary for an official release.
> Regards,
> Eric
> On 3/21/19, 2:44 PM, "Elek, Marton" <elek@apache.org> wrote:
>     > If versioning is done correctly, older branches can have the same
>     > docker subproject, and Hadoop 2.7.8 can be released for older
>     > Hadoop branches.  We don't generate timeline paradox to allow
>     > changing the history of Hadoop 2.7.1.  That release has passed and
>     > let it stay that
>     I understand your point but I am afraid that my concerns were not
>     expressed clearly enough (sorry for that).
>     Let's say that we use centos as the base image. In case of a security
>     problem on the centos side (eg. in libssl) or jdk side, I would rebuild
>     all the hadoop:2.x / hadoop:3.x images and republish them. Exactly the
>     same hadoop bytes but updated centos/jdk libraries.
>     I understand your concern that in this case an image with the same
>     tag (eg. hadoop:3.2.1) will change over time. But this can be
>     solved by adding date-specific postfixes (eg. the hadoop:3.2.1-20190321
>     tag would never change but hadoop:3.2.1 can be changed)
>     I know that it's not perfect, but this is widely used. For example the
>     centos:7 tag is not fixed but centos:7.6.1810 is (hopefully).
>     Without this flexibility any centos/jdk security issue can invalidate
>     all of our images (and would require new releases from all the active lines)
>     Marton
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: common-dev-help@hadoop.apache.org

