metron-dev mailing list archives

From Ryan Merriman <merrim...@gmail.com>
Subject Re: [DISCUSS] Build Times are getting out of hand
Date Tue, 07 Feb 2017 19:52:13 GMT
Down to 24 minutes?  Nice job.

On Tue, Feb 7, 2017 at 1:49 PM, Casey Stella <cestella@gmail.com> wrote:

> I spent a minute or two looking at how we might use travis configuration
> alone to drop the wall-clock time of the build and put it up for review at
> https://github.com/apache/incubator-metron/pull/444
>
> It does 3 things:
>
>    - Separates the build, the unit tests and the integration tests
>    - Parallelizes the unit tests and the build and runs the integration
>    tests within the travis container
>    - Runs the unit tests and integration tests in separate travis
>    containers using travis' build matrix
>
> This ultimately cuts the wall-clock time down to 24 minutes for me on travis
> and should give us some breathing room, where we're not constantly bouncing
> builds, to act on the suggestions here.
>
>
> On Tue, Feb 7, 2017 at 1:03 PM, Michael Miklavcic <
> michael.miklavcic@gmail.com> wrote:
>
> > FYI, found this for Docker - https://docs.travis-ci.com/user/docker/
> >
> > On Tue, Feb 7, 2017 at 9:09 AM, David Lyle <dlyle65535@gmail.com> wrote:
> >
> > > Absolutely agree. I also think we'd want both once we've done that.
> > > Travis is good for smoke testing PRs and Commits. Jenkins is good for
> > > nightly runs of medium duration tests and would be great for automating
> > > our distributed testing if we found infrastructure to support it. I've
> > > seen them used in concert to provide a good solution.
> > >
> > > But, initially, I'd like to see us get our in-process stuff replaced with
> > > docker where (if) it makes sense, refactored to run in parallel, the poms
> > > refactored to handle our dependencies better and our uber jars removed
> > > where they can be and minimized where they cannot be.
> > >
> > > Which, I think, is a long-winded way of saying "I'd like to see us do what
> > > Casey suggested." :)
> > >
> > > -D...
> > >
> > >
> > > On Tue, Feb 7, 2017 at 10:45 AM, Michael Miklavcic <
> > > michael.miklavcic@gmail.com> wrote:
> > >
> > > > I agree with this. I don't think we should switch to an alternate system
> > > > until we find that we are absolutely incapable of eking out any further
> > > > efficiency from the current setup.
> > > >
> > > > On Tue, Feb 7, 2017 at 8:04 AM, Casey Stella <cestella@gmail.com> wrote:
> > > >
> > > > > I believe that some people use travis and some people request Jenkins
> > > > > from Apache Infra.  That being said, personally, I think we should take
> > > > > the opportunity to correct the underlying issues.  50 minutes for a
> > > > > build seems excessive to me.
> > > > >
> > > > > On Mon, Feb 6, 2017 at 10:07 PM, Otto Fowler <ottobackwards@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Is there an alternative to Travis?  Do other like-sized Apache
> > > > > > projects have these problems?  Do they use travis?
> > > > > >
> > > > > >
> > > > > > On February 6, 2017 at 17:02:37, Casey Stella (cestella@gmail.com)
> > > > > > wrote:
> > > > > >
> > > > > > > For those with pending/building pull requests, it will come as no
> > > > > > > surprise that our build times are increasing at a pace that is
> > > > > > > worrisome. In fact, we hit a fundamental limit associated with
> > > > > > > Travis over the weekend. We have crept up into the 40+ minute build
> > > > > > > territory and travis seems to error out at around 49 minutes.
> > > > > >
> > > > > > > Taking the current build
> > > > > > > (https://travis-ci.org/apache/incubator-metron/jobs/198929446),
> > > > > > > looking at just job times, we're spending about 19 - 20 minutes
> > > > > > > (1176.53 seconds) in tests out of 44 minutes and 42 seconds to do
> > > > > > > the build. This places the unit tests at around 44% of the build
> > > > > > > time. I say all of this to point out that while unit tests are a
> > > > > > > portion of the build, they are not even the majority of the build
> > > > > > > time. We need an approach that addresses the whole build
> > > > > > > performance holistically and we need it soonest.
> > > > > >
> > > > > > > To seed the discussion, I will point to a few things that come to
> > > > > > > mind that fit into three broad categories:
> > > > > >
> > > > > > *Tests are Slow*
> > > > > >
> > > > > >
> > > > > > > - *Tactical*: We have around 13 tests that take more than 30
> > > > > > > seconds and make up 14 minutes of the build. Looking at what we can
> > > > > > > do to speed up those tests may be worthwhile as a tactical approach.
> > > > > > > - We are spinning up the same services (e.g. kafka, storm) for
> > > > > > > multiple tests. Instead, we could use the docker infrastructure to
> > > > > > > spin them up once and then use them throughout the tests.
> > > > > >
> > > > > >
> > > > > > *Tests aren't parallel*
> > > > > >
> > > > > > > Currently we cannot run the build in parallel due to the integration
> > > > > > > test infrastructure spinning up its own services that bind to the
> > > > > > > same ports. If we correct this, we can run the builds in parallel
> > > > > > > with mvn -T.
> > > > > > >
> > > > > > > - Correct this by decoupling the infrastructure from the tests and
> > > > > > > refactoring the tests to run in parallel.
> > > > > > > - Make the integration testing infrastructure bind intelligently to
> > > > > > > whatever port is available.
> > > > > > > - Move the integration tests to their own project. This will let us
> > > > > > > run the build in parallel since an individual project's tests will
> > > > > > > be run serially.
> > > > > >
> > > > > > *Packaging is Painful*
> > > > > >
> > > > > > > We have a sensitive environment in terms of dependencies. As such,
> > > > > > > we are careful to shade and relocate dependencies that we want to
> > > > > > > isolate from our transitive dependencies. The consequence of this
> > > > > > > is that we spend a lot of time in the build shading and relocating
> > > > > > > maven module output.
> > > > > > >
> > > > > > > - Do the hard work to walk our transitive dependencies and ensure
> > > > > > > that we are including only one copy of every library by using
> > > > > > > exclusions effectively. This will not only bring down build times,
> > > > > > > it will make sure we know what we're including.
> > > > > > > - Try to devise a strategy where we only shade once at the end. This
> > > > > > > could look like some combination of
> > > > > > >    - standardizing on the lowest common denominator of a troublesome
> > > > > > >    library
> > > > > > >    - We shade in dependencies so they can use different versions of
> > > > > > >    libraries (e.g. metron-common with a modern version of guava)
> > > > > > >    than the final jars.
> > > > > > >    - exclusions
> > > > > > >    - externalizing infrastructure out to not necessitate spinning up
> > > > > > >    hadoop components in-process for integration tests (i.e. hbase
> > > > > > >    server conflicts with storm in a few dependencies)
> > > > > >
> > > > > > *Final Thoughts*
> > > > > >
> > > > > > If I had three to pick, I'd pick
> > > > > >
> > > > > > > - moving off of the in-memory component infrastructure to docker
> > > > > > > images
> > > > > > > - fixing the maven poms to exclude correctly
> > > > > > > - ensuring the resulting tests are parallelizable
> > > > > > >
> > > > > > > I will point out that fixing the maven poms to exclude correctly
> > > > > > > (i.e. we choose the version of every jar that we depend on
> > > > > > > transitively) ticks multiple boxes, not just making things faster.
> > > > > > >
> > > > > > > What are your thoughts? What did I miss? We need a plan and we need
> > > > > > > to execute on it soon, otherwise travis is going to keep smacking us
> > > > > > > hard. It may be worthwhile constructing a tactical plan and then a
> > > > > > > more strategic plan that we can work toward. I was heartened at how
> > > > > > > much some of these suggestions dovetail with the discussion around
> > > > > > > the future of the docker infrastructure.
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Casey
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
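
The build-time figures quoted in the original message can be checked with a few lines of arithmetic. This is only a sketch of the calculation: the numbers are the ones given in the thread, and the function name `share_of_build` is made up for illustration.

```python
def share_of_build(test_seconds, build_minutes, build_seconds):
    """Return the fraction of total build wall-clock time spent in tests."""
    total_seconds = build_minutes * 60 + build_seconds
    return test_seconds / total_seconds


# Figures quoted in the thread: 1176.53 s of tests in a 44 min 42 s build.
share = share_of_build(1176.53, 44, 42)
print(f"tests: {1176.53 / 60:.1f} min, {share:.1%} of the build")
# prints "tests: 19.6 min, 43.9% of the build"
```

So the tests account for a little under 44% of the build, which supports the point that the rest of the build (compilation, shading, packaging) takes the larger share.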

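The suggestion to have the integration test infrastructure "bind intelligently to whatever port is available" is commonly done by asking the operating system for an ephemeral port (binding to port 0). The following is a minimal, language-agnostic sketch of that idea in Python, not Metron's actual test code; the helper name `bind_free_port` and the `kafka_*`/`storm_*` variable names are hypothetical.

```python
import socket


def bind_free_port(host="127.0.0.1"):
    """Bind a TCP socket to an OS-assigned free port.

    Returns (socket, port). The socket is returned still bound so that
    another process cannot grab the port before the caller uses it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, 0))  # port 0 asks the OS to pick any free port
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock, sock.getsockname()[1]


if __name__ == "__main__":
    # Two hypothetical in-process services started side by side: each gets
    # its own port, so parallel test runs do not collide on a fixed one.
    kafka_sock, kafka_port = bind_free_port()
    storm_sock, storm_port = bind_free_port()
    print("service ports:", kafka_port, storm_port)
    kafka_sock.close()
    storm_sock.close()
```

Returning the bound socket, rather than only the port number, avoids the race where a port is found free, released, and then taken by another test before the service starts. Components that only accept a port number in their configuration would still need the find-then-release variant and should tolerate an occasional retry.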