nifi-dev mailing list archives

From Joe Witt <joe.w...@gmail.com>
Subject Re: [DISCUSS] CI / Travis / Jenkins
Date Thu, 07 Dec 2017 02:38:47 GMT
Team,

Ok so finally some really solid news to share on the Travis-CI front.
First, huge thanks to Aldrin for getting this started and folks like
Andre and Pierre who have tweaked it to make it more usable as well.
After a long run of it helping us out, as we all know, it went poorly,
with every build failing for what seemed like months.

After some improvements and updates to our usage of maven, parallel
builds with contrib check now seem to be working, and after going
ruthless mode on hunting down unstable tests and either fixing them or
making them integration tests, the build is far more stable.  We all
need to try and stay on top of that.  Today though I realized that our
builds were happening twice, and that appeared to be why it took
roughly 50 minutes to finish, at best, and we'd time out and fail.  So
after adjusting our .travis.yml we now only build once and the process
takes about 25 mins, so we're well within the time limit.
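
As a point of reference, one common way a Maven project ends up building
twice on Travis CI is that the default install phase for Java projects
runs its own "mvn install" (with tests skipped) before the script phase
runs the real build. The fragment below is a minimal sketch of skipping
that first pass. It assumes this was the kind of duplication involved,
which this thread does not confirm, and the script line is a placeholder
rather than the project's actual command.

```yaml
language: java

# Travis CI's default install step for Maven projects already runs
# "mvn install -DskipTests=true -Dmaven.javadoc.skip=true -B -V".
# Replacing it with a no-op leaves the script step as the only build.
install: true

# Placeholder command (assumption): the real NiFi invocation is not shown here.
script:
  - mvn -B clean install
```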

Latest build on travis-ci: https://travis-ci.org/apache/nifi/builds/312629807
Appveyor builds:
https://ci.appveyor.com/project/ApacheSoftwareFoundation/nifi/build/1.0.0-SNAPSHOT-6649

So we're heading in the right direction.  If it stays stable perhaps
we could add openjdk builds as well.
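
If openjdk builds were added, the Travis CI jdk key would be the likely
place for them. The following is a hedged sketch only: it assumes the
existing configuration builds on oraclejdk8 (not stated in this thread)
and reuses the three locales discussed further down. Every jdk entry is
combined with every env entry, so a second JDK would double the number
of jobs from three to six, which matters given the earlier concern about
strain on the build infrastructure.

```yaml
language: java

jdk:
  - oraclejdk8   # assumed current JDK, not confirmed in this thread
  - openjdk8     # hypothetical addition

# Three locale environments; combined with two JDKs this yields six jobs.
env:
  - USER_LANGUAGE=en USER_REGION=US
  - USER_LANGUAGE=fr USER_REGION=FR
  - USER_LANGUAGE=ja USER_REGION=JP
```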

Thanks
Joe

On Tue, Dec 5, 2017 at 4:11 PM, Joe Witt <joe.witt@gmail.com> wrote:
> OK well things are looking pretty good.  The only obvious problem now
> is that our builds take about 45-50 mins on travis-ci.org and the
> build time limit is 50 mins [1] so some jobs get killed.
>
> Will look at areas we can avoid spending build time on at least in
> travis-ci land.  Probably no great option but let's see.
>
> [1] https://docs.travis-ci.com/user/customizing-the-build#Build-Timeouts
>
> On Tue, Dec 5, 2017 at 2:56 PM, Joe Witt <joe.witt@gmail.com> wrote:
>> Will try it out for PR https://github.com/apache/nifi/pull/2319 which
>> is being built under
>> https://travis-ci.org/apache/nifi/builds/312043710
>>
>> On Tue, Dec 5, 2017 at 2:51 PM, Joe Witt <joe.witt@gmail.com> wrote:
>>> Andre
>>>
>>> Thanks - read through https://issues.apache.org/jira/browse/NIFI-1657
>>> where this was discussed and where the relevant multi-env commit came
>>> in.
>>>
>>> Seems like five environments may be too taxing based on the build
>>> failures I'm observing.  I'll cut it down to three (FR, JP, US) for
>>> now.  We can evaluate if that helps at all and add more back if
>>> things become stable.
>>>
>>> Thanks
>>> Joe
>>>
>>> On Tue, Dec 5, 2017 at 12:20 AM, Andre <andre-lists@fucs.org> wrote:
>>>> Joe,
>>>>
>>>> Glad to help! Few notes:
>>>>
>>>> If I recall correctly there was a reason we chose to add default and BR but
>>>> to be honest I can't really remember what it was. I think it has to do with
>>>> Time Zones + Locale issues and has helped detect bizarre issues in
>>>> time-based junits (Matt B and Pierre may remember this).
>>>>
>>>> Regarding the rat check: the idea behind that was a fast failure in case of
>>>> basic style violations, rather than waiting until the end of the compilation.
>>>> To be honest I don't know if this has worked as desired but it should allow us
>>>> to quickly identify validation errors which if I recall correctly were only
>>>> detected at the end of contrib-check.
>>>>
>>>> And apologies for the anecdotal comments. I am away from my dev environment
>>>> atm so I can't truly validate them.
>>>>
>>>>
>>>> Kind regards
>>>>
>>>>
>>>> On Tue, Dec 5, 2017 at 3:31 PM, Joe Witt <joe.witt@gmail.com> wrote:
>>>>
>>>>> Great news!  So for the first time in a long time we now have
>>>>> travis-ci builds passing!
>>>>>
>>>>> I incorporated Dustin's PR which changed to the -Ddir-only instead of
>>>>> -P, added Andre's idea of dropping the -quiet flag, and dropped the
>>>>> number of builds in the config to a single parallel build with contrib
>>>>> check now that we're seeing those pass with rat/checkstyle.
>>>>>
>>>>> https://travis-ci.org/apache/nifi/builds/311660398
>>>>>
>>>>> A couple failed due to test failures and I filed JIRAs to convert
>>>>> these into integration tests or resolve.
>>>>>  -https://issues.apache.org/jira/browse/NIFI-4660,
>>>>> https://issues.apache.org/jira/browse/NIFI-4659
>>>>>
>>>>> One actually finished as you can see in its raw log but travis seems
>>>>> to have gotten confused.
>>>>>
>>>>> Two passed completely.  I think to reduce strain on Travis-CI
>>>>> infrastructure we should drop two of the environments.
>>>>>
>>>>> Currently it is in .travis.yml
>>>>>
>>>>> env:
>>>>>   - USER_LANGUAGE=en USER_REGION=US
>>>>>   - USER_LANGUAGE=fr USER_REGION=FR
>>>>>   - USER_LANGUAGE=ja USER_REGION=JP
>>>>>   - USER_LANGUAGE=pt USER_REGION=BR
>>>>>   - USER_LANGUAGE=default USER_REGION=default
>>>>>
>>>>> I think we should drop it to
>>>>>
>>>>> env:
>>>>>   - USER_LANGUAGE=en USER_REGION=US
>>>>>   - USER_LANGUAGE=fr USER_REGION=FR
>>>>>   - USER_LANGUAGE=ja USER_REGION=JP
>>>>>
>>>>> If no objections I'll do that soon.  But, good news is the builds are
>>>>> coming back to life on Travis-CI and will help streamline review
>>>>> cycles again!
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Mon, Dec 4, 2017 at 8:29 PM, Joe Witt <joe.witt@gmail.com> wrote:
>>>>> > nope. will take a look at this tonight though.
>>>>> >
>>>>> > On Dec 4, 2017 8:09 PM, "Andre" <andre-lists@fucs.org> wrote:
>>>>> >>
>>>>> >> Joe & Joey,
>>>>> >>
>>>>> >> I believe setting the maven compilation job to noisy - instead of the
>>>>> >> current quiet setting - should help solve the issue.
>>>>> >>
>>>>> >> Have we tried that?
>>>>> >>
>>>>> >> Cheers
>>>>> >>
>>>>> >>
>>>>> >> On 5 Dec 2017 6:26 AM, "Joe Witt" <joe.witt@gmail.com> wrote:
>>>>> >>
>>>>> >> I agree this would be extremely nice to get back on track.  The
>>>>> >> changes made last night/today to the poms do appear to mean that
>>>>> >> parallel builds with contrib-check are working.  Perhaps that helps us
>>>>> >> a little with travis (or not).  I have reviewed a couple PRs though
>>>>> >> recently that did not even compile much less have clean contrib-checks
>>>>> >> so it is really nice to have Travis being more reliable.  Does anyone
>>>>> >> have any sense of the current reasons for issues?  When I've looked
>>>>> >> the errors made no sense at all.
>>>>> >>
>>>>> >> On Mon, Dec 4, 2017 at 2:21 PM, Joey Frazee <joey.frazee@icloud.com>
>>>>> >> wrote:
>>>>> >> > I’m sure everyone has noticed that Travis CI fails, incorrectly, more
>>>>> >> > than it succeeds, often due to timeouts and not b/c of the incorrectness
>>>>> >> > of a commit or PR.
>>>>> >> >
>>>>> >> > This has been discussed previously, but it’s carried on, and become a
>>>>> >> > low information signal about the PRs, which has two big impacts: (1) it’s
>>>>> >> > ignored by experienced contributors and reviewers, and (2) it’s confusing
>>>>> >> > or misleading to new contributors.
>>>>> >> >
>>>>> >> > So, we really need to find a solution. I can think of a few:
>>>>> >> >
>>>>> >> > 1. Continue to push on INFRA to set up Jenkins for NiFi and its
>>>>> >> > sub-projects.
>>>>> >> >
>>>>> >> > 2. Implement some kind of quick-test profile and shell script that
>>>>> >> > checks the most important things along with the subdirectories affected
>>>>> >> > by the PR, and continue to use Travis CI.
>>>>> >> >
>>>>> >> > 3. Use some other service like Circle CI or Codeship, which probably
>>>>> >> > isn’t quite what ASF wants but it might make the CI more useful (it also
>>>>> >> > might not).
>>>>> >> >
>>>>> >> > 4. Find a sponsor to support a more premium tier of Travis CI (or
>>>>> >> > equiv.) so the build has enough resources to succeed. This too probably
>>>>> >> > isn’t preferable but I’m sure we can find a precedent.
>>>>> >> >
>>>>> >> > I’m partial to pursuing (1) and (2) together because (1) would give us a
>>>>> >> > long term solution and (2) would have some value for local builds (no need
>>>>> >> > to run the full build) as well as making Travis CI tell us something.
>>>>> >> > The first should be pretty low effort. The second will be labor intensive,
>>>>> >> > I think (to identify what counts as quick and change the poms), so it can’t
>>>>> >> > be the answer on its own unless we want to wait longer to see Travis CI
>>>>> >> > become informative.
>>>>> >> >
>>>>> >> > What do the rest of you think?
>>>>> >> >
>>>>> >> > -joey
>>>>>
