hadoop-yarn-dev mailing list archives

From Eric Yang <ey...@hortonworks.com>
Subject Re: [DISCUSS] Docker build process
Date Thu, 21 Mar 2019 22:33:10 GMT
The flexibility of a date-appended release number is equivalent to the Maven snapshot or Docker latest
image conventions, and a machine can apply a timestamp more reliably than a human.  By using the Jenkins release
process, this can be done with little effort.  For an official release, it is best to pin the Docker
image digest to ensure uniqueness.  E.g.

FROM centos@sha256:67dad89757a55bfdfabec8abd0e22f8c7c12a1856514726470228063ed86593b 

Developers building from a downloaded release source would then build against exactly the same Docker
image, without side effects.
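Digest pinning like the FROM line above can be wired into release tooling. A minimal sketch in Python (the helper name and validation are illustrative assumptions, not part of any existing Hadoop tooling):

```python
import re

def pin_to_digest(image: str, digest: str) -> str:
    """Build a digest-pinned image reference for a Dockerfile FROM line.

    Hypothetical helper: validates the digest shape before emitting the
    reference, so a release script fails fast on a malformed digest.
    """
    if not re.fullmatch(r"sha256:[0-9a-f]{64}", digest):
        raise ValueError("expected a sha256 digest")
    return f"{image}@{digest}"

ref = pin_to_digest(
    "centos",
    "sha256:67dad89757a55bfdfabec8abd0e22f8c7c12a1856514726470228063ed86593b",
)
print(f"FROM {ref}")
```

Unlike a tag, such a reference always resolves to the same image bytes, which is what makes the reproducible official-release build possible.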

A couple of years ago, Red Hat decided to fix an SSL vulnerability in RHEL 6/7 by adding an extra
parameter to disable certificate validation in the Python urllib2 library and forcing certificate
signer validation on by default.  It completely broke the Ambari agent and its self-signed certificates.
 Customers had to backtrack to a specific version of the Python SSL library to keep their
production clusters operational.  Without the due diligence of certifying the Hadoop code and
the OS image together, there is room for error.  The OS-update case is a perfect example of why
we want the container OS image certified along with the Hadoop binary release.
 A snapshot release can leave that wiggle room for developers, but I don't think the flexibility
is necessary for an official release.


On 3/21/19, 2:44 PM, "Elek, Marton" <elek@apache.org> wrote:

    > If versioning is done correctly, older branches can have the same docker subproject,
    > and Hadoop 2.7.8 can be released for older Hadoop branches.  We don't generate timeline paradox
    > to allow changing the history of Hadoop 2.7.1.  That release has passed and let it stay that
    I understand your point but I am afraid that my concerns were not
    expressed clearly enough (sorry for that).
    Let's say that we use centos as the base image. In case of a security
    problem on the centos side (eg. in libssl) or the jdk side, I would rebuild
    all the hadoop:2.x / hadoop:3.x images and republish them. Exactly the
    same hadoop bytes, but with updated centos/jdk libraries.
    I understand your concern that in this case an image with the same
    tag (eg. hadoop:3.2.1) would change over time. But this can be
    solved by adding date-specific suffixes (eg. the hadoop:3.2.1-20190321 tag
    would never change, but hadoop:3.2.1 can change).
    I know that it's not perfect, but this is widely used. For example, the
    centos:7 tag is not fixed, but centos:7.6.1810 is (hopefully).
    Without this flexibility, any centos/jdk security issue could invalidate
    all of our images (and would require new releases from all the active lines).
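The dual-tag scheme described above can be sketched as a small release helper. A hypothetical sketch in Python (the function name is illustrative, not existing Hadoop release tooling):

```python
from datetime import date

def image_tags(repo: str, version: str, build_date: date) -> tuple[str, str]:
    """Return (immutable date-suffixed tag, floating version tag)."""
    # e.g. hadoop:3.2.1-20190321 -- never republished, safe to pin
    immutable = f"{repo}:{version}-{build_date.strftime('%Y%m%d')}"
    # e.g. hadoop:3.2.1 -- may be re-pushed after a base-image security rebuild
    floating = f"{repo}:{version}"
    return immutable, floating

print(image_tags("hadoop", "3.2.1", date(2019, 3, 21)))
# prints: ('hadoop:3.2.1-20190321', 'hadoop:3.2.1')
```

A rebuild after a base-image CVE would push both tags again: the new build gets a fresh immutable tag for that date, while the floating tag moves to point at it.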
