spark-dev mailing list archives

From shane knapp ☠ <skn...@berkeley.edu>
Subject [build system] IMPORTANT UPDATE
Date Tue, 24 Nov 2020 19:24:38 GMT
this is a lengthy, but important read for everyone here.

in the next few days, the remaining centos machines (PRB/SBT workers AND
primary) will be reimaged from centos6.9 to ubuntu 20.04LTS.

this means three important things on the very near horizon:
1 -- the PRB and SBT tests WILL BE BROKEN (by thanksgiving)
2 -- jenkins itself will be down for a while as we move the jenkins
installation to its new home.
3 -- those of you with accounts here will temporarily lose access

regarding (1), brian (cced) will be helping me debug and fix any
system-level bugs (python envs, missing packages, etc).  jon (cced) will be
doing the reimaging and cobbling together of hardware to keep us on our
feet.  their help is going to be invaluable to getting us back on the
ground.

we already have two ubuntu 20 workers up and building
(research-jenkins-worker-0[1,2]), and the SparkPullRequestBuilder-K8s build
is already green.  i'll keep an eye on these workers to ensure i didn't
miss anything.

once we have a couple more ubuntu 20 machines up, i'll move the PRB and
SBT builds there and let them fail as often as possible so we can use the
build logs during the migration of the primary.

then we shut down jenkins and move to the new primary.

this will all be happening in the next week to week-and-a-half.

nearish on the horizon, we need to do three things:
1 -- reimage the ubuntu 16 workers
2 -- clean up all of the breakages within the jenkins plugin universe.
there are a lot of stack traces everywhere after the upgrade, but things are
still building so i'm inclined to push this out.
3 -- fix the PRB/SBT builds.

further off, once we're stable, we (the spark community) will need to have
an honest conversation about where the build system lives.  we don't
currently have enough resources here to manage the system in a way that it
deserves, and i can't foresee getting the staffing for long-term support any
time soon.

however, with the ansible configs (which i plan on moving to the spark
repo), it should be much easier to replicate the build system.

by this time next year, i would like to have helped find the build system a
new home, and sunset jenkins.  over the past 11 years (i think), this
system has built spark.  it's getting a little tired and needs a
well-deserved break.  :)

shane
-- 
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
