metron-dev mailing list archives

From Michael Miklavcic <michael.miklav...@gmail.com>
Subject Re: [DISCUSS] Build Times are getting out of hand
Date Tue, 07 Feb 2017 14:26:15 GMT
I can't recall, did we have a good solution around Docker and remote
debugging integration tests from the IDE? On the topic of test refactoring
and running in parallel, I'm all for it. I know JJ had been doing this on
his local machine at one point, but we'd need to be sure all tests are
truly independent. E.g. counts on hbase tables would need to be very
specific or every test should use unique tables. Also, can we spin up
something like Docker in Travis? How many cores do we get? I'll look into
that and see what we get.
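
One way to keep table-level assertions independent when tests run in
parallel is to give every test its own table name. A minimal sketch in
Python, purely illustrative: the helper `unique_table_name`, the `it`
prefix, and the example test name are assumptions, not part of the Metron
test infrastructure.

```python
import re
import uuid


def unique_table_name(test_name, prefix="it"):
    """Return a table name that differs on every call (hypothetical helper)."""
    # Replace characters that many table naming schemes reject.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", test_name)
    # A random suffix makes it very unlikely that two concurrent runs
    # of the same test end up sharing a table.
    return f"{prefix}_{safe}_{uuid.uuid4().hex[:8]}"


first = unique_table_name("profiler-integration")
second = unique_table_name("profiler-integration")
print(first)
print(second)
```

With names like these, a row count only ever sees rows written by the test
that created the table, so the assertion does not depend on execution order.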

I'm all for simplifying our dependencies. Shading the jars takes an
incredible amount of time and has bitten us repeatedly.
Another bummer about the jar shading has been that the build runs
differently in IntelliJ than it does from the Maven command line. I don't
think we'll get away from it entirely, but we may be able to make this
better as well.

From my most recent local build, these are the biggest offending modules:
metron-profiler .................................... SUCCESS [05:56 min]
metron-parsers ..................................... SUCCESS [09:38 min]
metron-data-management ............................. SUCCESS [09:15 min]
elasticsearch-shaded ............................... SUCCESS [08:05 min]
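
To put a number on that, here is a rough Python sketch that parses the four
lines above and totals them. It assumes every line uses the `[MM:SS min]`
form shown here; the regular expression and the function name
`module_seconds` are my own and would need extending for reactor lines
reported in other units.

```python
import re

# The four reactor summary lines quoted above.
SUMMARY = """\
metron-profiler .................................... SUCCESS [05:56 min]
metron-parsers ..................................... SUCCESS [09:38 min]
metron-data-management ............................. SUCCESS [09:15 min]
elasticsearch-shaded ............................... SUCCESS [08:05 min]
"""

# Assumption: module name, dot leader, SUCCESS, then [MM:SS min].
LINE = re.compile(r"^(\S+) \.+ SUCCESS \[(\d+):(\d+) min\]$")


def module_seconds(summary):
    """Map each module name to its build time in seconds."""
    times = {}
    for line in summary.splitlines():
        match = LINE.match(line.strip())
        if match:
            name, minutes, seconds = match.groups()
            times[name] = int(minutes) * 60 + int(seconds)
    return times


times = module_seconds(SUMMARY)
total = sum(times.values())
# Slowest module first.
for name, secs in sorted(times.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<24}{secs // 60:02d}:{secs % 60:02d}")
print(f"{'total':<24}{total // 60:02d}:{total % 60:02d}")
```

The four modules add up to 1974 seconds, i.e. 32:54 of the local build.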

I'm going to take a look at Travis and also see what pom dependencies I can
start excluding.


On Mon, Feb 6, 2017 at 3:02 PM, Casey Stella <cestella@gmail.com> wrote:

> For those with pending/building pull requests, it will come as no surprise
> that our build times are increasing at a pace that is worrisome.  In fact,
> we have hit a fundamental limit associated with Travis over the weekend.
> We have crept up into 40+ minute build territory and Travis seems to
> error out at around 49 minutes.
>
> Taking the current build (
> https://travis-ci.org/apache/incubator-metron/jobs/198929446), looking at
> just job times, we're spending about 19 - 20 minutes (1176.53 seconds) in
> tests out of 44 minutes and 42 seconds to do the build.  This places the
> unit tests at around 43% of the build time.  I say all of this to point out
> that while unit tests are a portion of the build, they are not even the
> majority of the build time.  We need an approach that addresses the whole
> build performance holistically and we need it soonest.
>
> To seed the discussion, I will point to a few things that come to mind that
> fit into three broad categories:
>
> *Tests are Slow*
>
>
>    - *Tactical*: We have around 13 tests that take more than 30 seconds and
>    make up 14 minutes of the build.  Speeding up those tests may be worth
>    considering as a tactical approach.
>    - We are spinning up the same services (e.g. kafka, storm) for multiple
>    tests.  Instead, we could use the docker infrastructure to spin them up
>    once and then use them throughout the tests.
>
>
> *Tests aren't parallel*
>
> Currently we cannot run the build in parallel due to the integration test
> infrastructure spinning up its own services that bind to the same ports.
> If we correct this, we can run the builds in parallel with mvn -T
>
>    - Correct this by decoupling the infrastructure from the tests and
>    refactoring the tests to run in parallel.
>    - Make the integration testing infrastructure bind intelligently to
>    whatever port is available.
>    - Move the integration tests to their own project.  This will let us run
>    the build in parallel since an individual project's test will be run
>    serially.
>
> *Packaging is Painful*
>
> We have a sensitive environment in terms of dependencies.  As such, we are
> careful to shade and relocate dependencies that we want to isolate from our
> transitive dependencies.  The consequence of this is that we spend a lot
> of time in the build shading and relocating maven module output.
>
>    - Do the hard work to walk our transitive dependencies and ensure that
>    we are including only one copy of every library by using exclusions
>    effectively.  This will not only bring down build times, it will make
>    sure we know what we're including.
>    - Try to devise a strategy where we only shade once at the end.  This
>    could look like some combination of
>       - standardizing on the lowest common denominator of a troublesome
>       library
>          - We shade in dependencies so they can use different versions of
>          libraries (e.g. metron-common with a modern version of guava)
>          than the final jars.
>       - exclusions
>       - externalizing infrastructure out to not necessitate spinning up
>       hadoop components in-process for integration tests (i.e. hbase server
>       conflicts with storm in a few dependencies)
>
> *Final Thoughts*
>
> If I had three to pick, I'd pick
>
>    - moving off of the in-memory component infrastructure to docker images
>    - fixing the maven poms to exclude correctly
>    - ensuring the resulting tests are parallelizable
>
> I will point out that fixing the maven poms to exclude correctly (i.e. we
> choose the version of every jar that we depend on transitively) ticks
> multiple boxes, not just making things faster.
>
> What are your thoughts?  What did I miss?  We need a plan and we need to
> execute on it soon, otherwise travis is going to keep smacking us hard.  It
> may be worthwhile constructing a tactical plan and then a more strategic
> plan that we can work toward.  I was heartened at how much some of these
> suggestions dovetail with the discussion around the future of the docker
> infrastructure.
>
> Best,
>
> Casey
>
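
On the quoted suggestion to have the integration testing infrastructure
bind to whatever port is available: the usual trick is to bind to port 0
and let the operating system pick an unused port. A minimal Python sketch
of the idea follows. The function name `bind_to_free_port` is hypothetical,
and this is not how the Metron in-memory components are wired today.

```python
import socket


def bind_to_free_port(host="127.0.0.1"):
    """Bind a TCP socket to a port chosen by the operating system."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the OS to assign any currently unused port.
    sock.bind((host, 0))
    sock.listen()
    return sock


# Two "services" started side by side get different ports.
first = bind_to_free_port()
second = bind_to_free_port()
port_a = first.getsockname()[1]
port_b = second.getsockname()[1]
print(port_a, port_b)
first.close()
second.close()
```

The socket is kept open and handed to the caller on purpose: looking up a
free port, closing the socket, and binding again later leaves a window in
which another process can take the port.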
