hadoop-yarn-dev mailing list archives

From Eric Yang <ey...@hortonworks.com>
Subject [DISCUSS] Docker build process
Date Wed, 13 Mar 2019 22:24:11 GMT
Hi Hadoop developers,

In recent months, there have been various discussions about creating a docker build process for
Hadoop.  The mailing list converged last month on making the docker build process inline, when
the Ozone team was planning a new repository for Hadoop/ozone docker images.  Work has started
on a new feature that adds the docker image build process inline in the Hadoop build.
A few lessons were learnt from making the docker build inline in YARN-7129.  The build environment
must have docker for the docker build to succeed.  BUILDING.txt already recommends Docker as the
easy way to set up a build environment.  There is logic in place to ensure that the absence of
docker does not trigger the docker build.  The inline process tries to be as non-disruptive as
possible to existing development environments, with one exception: if docker’s presence is
detected but the user does not have rights to run docker, the build will fail.
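
The three outcomes described above (docker absent, docker usable, docker present but not
runnable by the user) can be sketched roughly as follows.  This is only an illustration and not
the logic implemented in YARN-7129: the function name docker_build_mode and the result words
skip/build/fail are hypothetical, and using "docker info" as the permission check is an assumption.

```shell
# Sketch only: decide how an inline docker build step could behave.
# docker_build_mode and its skip/build/fail results are hypothetical names.
docker_build_mode() {
  docker_bin="${1:-docker}"   # binary to probe; defaults to "docker"
  if ! command -v "$docker_bin" >/dev/null 2>&1; then
    echo "skip"    # docker is absent: do not trigger the docker build
  elif "$docker_bin" info >/dev/null 2>&1; then
    echo "build"   # docker is present and this user can talk to it
  else
    echo "fail"    # docker is present but unusable, e.g. no permission
  fi
}

docker_build_mode
```

Note that "docker info" also fails when the daemon is simply not running, so a real check would
probably need to tell that case apart from a permission problem.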

Now, some developers are pushing back on the inline docker build process because the existing
environment did not make the docker build mandatory.  However, there are benefits to using an
inline docker build process.  The listed benefits are:

1.  Source code tag, maven repository artifacts and docker hub artifacts can all be produced
in one build.
2.  Less manual labor to tag different source branches.
3.  Reduce intermediate build caches that may exist in multi-stage builds.
4.  Release engineers and developers do not need to search a maze of build flags to acquire

The disadvantages are:

1.  Requires developers to have access to docker.
2.  Default build takes longer.

There is a workaround for the above disadvantages: use the -DskipDocker flag to avoid the docker
build completely, or -pl !modulename to bypass subprojects.
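
For illustration, the two workarounds might be invoked as below.  The goals "clean install" and
the addition of -DskipTests are assumptions made for the example, and "modulename" is the same
placeholder used above, not a real Hadoop module.

```shell
# Skip the docker build everywhere (flag as named in this thread).
mvn clean install -DskipTests -DskipDocker

# Exclude a single subproject from the reactor instead.
# The quotes keep an interactive bash from treating "!" as history expansion.
mvn clean install -DskipTests -pl '!modulename'
```

The "!" exclusion syntax for -pl needs a reasonably recent Maven 3 release.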
Hadoop development has not followed Maven best practice, because a full Hadoop build requires
a number of profiles and configuration parameters.  Some of these evolutions work against Maven's
design and require forking separate source trees for different subprojects and pom files.
Maven best practice (https://dzone.com/articles/maven-profile-best-practices) advises against
using profiles to trigger different artifact builds, because this pattern introduces artifact
naming conflicts in the maven repository.  Maven offers flags to skip certain operations, such
as -DskipTests, -Dmaven.javadoc.skip=true, -pl, or -DskipDocker.  It seems worthwhile to make
some corrections so that the Hadoop build follows best practice.

Some developers have advocated for a separate build process for docker images.  We need consensus
on the direction that will work best for the Hadoop development community.  Hence, my questions
are:

Do we want to have an inline docker build process in maven?
If yes, it would be the developer’s responsibility to pass the -DskipDocker flag to skip docker.
 Docker would be mandatory for the default build.
If no, what is the release flow for docker images going to look like?

Thank you for your feedback.
