airflow-dev mailing list archives

From Daniel Huang <daniel.hu...@upsight.com>
Subject Re: [RESULT] [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc4
Date Sun, 26 Feb 2017 07:12:13 GMT
I think the skipped task issue mentioned is the same issue described in
AIRFLOW-872 <https://issues.apache.org/jira/browse/AIRFLOW-872>. I had a
DAG that was consistently hitting it. I believe the skipped task has to
have at least one downstream task for this to occur (e.g.
LatestOnlyOperator >> DummyOp1 >> DummyOp2), which may also explain why it
doesn't affect the tests.
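
A minimal sketch of the DAG shape described above, assuming the Airflow 1.8 import paths and the `>>` composition syntax. The dag_id, task ids, start date and schedule are hypothetical and only illustrate the chain; they are not taken from the DAG that actually hit the issue.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.latest_only_operator import LatestOnlyOperator

# Hypothetical DAG: ids, start_date and schedule are illustrative only.
dag = DAG(
    dag_id="latest_only_chain",
    start_date=datetime(2017, 2, 1),
    schedule_interval=timedelta(days=1),
)

latest_only = LatestOnlyOperator(task_id="latest_only", dag=dag)
dummy_op1 = DummyOperator(task_id="dummy_op1", dag=dag)
dummy_op2 = DummyOperator(task_id="dummy_op2", dag=dag)

# The skipped task (dummy_op1) has at least one downstream task (dummy_op2).
latest_only >> dummy_op1 >> dummy_op2
```

Runs for past schedule dates should take the skip path, which gives a skipped task that still has a downstream task. Whether this alone reproduces AIRFLOW-872 is an assumption.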

On Sat, Feb 25, 2017 at 9:35 AM, Alex Van Boxel <alex@vanboxel.be> wrote:

> I think what I observed doesn't happen every time, otherwise we would see it
> every time. I'll see if it happens again tonight.
>
> On Sat, Feb 25, 2017 at 1:24 PM Jeremiah Lowin <jlowin@apache.org> wrote:
>
> > Interesting if this is related to what I was seeing -- but to be clear, the
> > error I observed is non-deterministic and doesn't happen every time
> > (obviously, because otherwise there would be no passing Travis runs). Is
> > that the case for what you're describing, Dan/Alex?
> >
> > On Sat, Feb 25, 2017 at 4:13 AM Alex Van Boxel <alex@vanboxel.be> wrote:
> >
> > About: "Skipped tasks potentially cause a dagrun to be marked
> > failure/success prematurely." Isn't that related to the discussion I had
> > with Max about the ONE_SUCCESS trigger? When skipping tasks, for now you
> > need to set ONE_SUCCESS. I had a fix of sorts, but it was rejected because
> > it changed behaviour.
> >
> > On Sat, Feb 25, 2017 at 9:19 AM Bolke de Bruin <bdbruin@gmail.com> wrote:
> >
> > > Not trying to muddy the waters, but Jeremiah's observation
> > > (non-deterministic outcomes) might have something to do with #3. I
> > > haven't dug in deeper yet.
> > >
> > > ======================================================================
> > > ERROR: test_backfill_examples (tests.BackfillJobTest)
> > > ----------------------------------------------------------------------
> > > Traceback (most recent call last):
> > >   File "/home/travis/build/apache/incubator-airflow/tests/jobs.py", line 164, in test_backfill_examples
> > >     job.run()
> > >   File "/home/travis/build/apache/incubator-airflow/airflow/jobs.py", line 200, in run
> > >     self._execute()
> > >   File "/home/travis/build/apache/incubator-airflow/airflow/jobs.py", line 1999, in _execute
> > >     raise AirflowException(err)
> > > AirflowException: ---------------------------------------------------
> > > Some task instances failed:
> > > set([('example_short_circuit_operator', 'condition_is_True', datetime.datetime(2016, 1, 1, 0, 0))])
> > > https://s3.amazonaws.com/archive.travis-ci.org/jobs/204780706/log.txt
> > >
> > > Bolke
> > >
> > > > On 25 Feb 2017, at 09:07, Bolke de Bruin <bdbruin@gmail.com> wrote:
> > > >
> > > > Hi Dan,
> > > >
> > > > - Backfill indeed runs only one dagrun at a time, see line 1755 of
> > > >   jobs.py. I’ll think about how to fix this over the weekend (I think it
> > > >   was my change that introduced this). Suggestions always welcome.
> > > >   Depending on the impact it is a blocker or not. We don’t often use
> > > >   backfills, and definitely not at your size, which is why it didn’t pop
> > > >   up with us. I’m assuming blocker for now, btw.
> > > > - Speculation on the High DB Load. I’m not sure what your benchmark is
> > > >   here (1.7.1 + multi processor dags?), but as you mentioned, in the code
> > > >   dependencies are checked a couple of times for one run and even for one
> > > >   task instance. Dependency checking requires aggregation on the DB,
> > > >   which is a performance killer. Annoying but not a blocker.
> > > > - Skipped tasks potentially cause a dagrun to be marked failure/success
> > > >   prematurely. BranchOperators are widely used; if it affects these
> > > >   operators, then it is a blocker.
> > > >
> > > > - Bolke
> > > >
> > > >> On 25 Feb 2017, at 02:04, Dan Davydov <dan.davydov@airbnb.com.INVALID> wrote:
> > > >>
> > > >> Update on old pending issues:
> > > >> - Black Squares in UI: Fix merged
> > > >> - Double Trigger Issue That Alex G Mentioned: Alex has a PR in flight
> > > >>
> > > >> New Issues:
> > > >> - Backfill seems to be having issues (only running one dagrun at a
> > > >>   time), we are still investigating - might be a blocker
> > > >> - High DB Load (~8x more than 1.7) - We are still investigating but
> > > >>   it's probably not a blocker for the release
> > > >> - Skipped tasks potentially cause a dagrun to be marked as
> > > >>   failure/success prematurely - not sure whether or not to classify
> > > >>   this as a blocker (only really an issue for users who use the
> > > >>   BranchPythonOperator, which AirBnB does)
> > > >>
> > > >> On Thu, Feb 23, 2017 at 5:59 PM, siddharth anand <sanand@apache.org> wrote:
> > > >>
> > > >>> IMHO, a DAG run without a start date is nonsensical, but this is not
> > > >>> enforced. That said, our UI allows for the manual creation of DAG Runs
> > > >>> without a start date, as shown in the images below:
> > > >>>
> > > >>>  - https://www.dropbox.com/s/3sxcqh04eztpl7p/Screenshot%202017-02-22%2016.00.40.png?dl=0
> > > >>>  - https://www.dropbox.com/s/4q6rr9dwghag1yy/Screenshot%202017-02-22%2016.02.22.png?dl=0
> > > >>>
> > > >>>
> > > >>> On Wed, Feb 22, 2017 at 2:26 PM, Maxime Beauchemin <maximebeauchemin@gmail.com> wrote:
> > > >>>
> > > >>>> Our database may have edge cases that could be associated with running
> > > >>>> any previous version that may or may not have been part of an official
> > > >>>> release.
> > > >>>>
> > > >>>> Let's see if anyone else reports the issue. If no one does, one option
> > > >>>> is to release 1.8.0 as is with a comment in the release notes, and have
> > > >>>> a future official minor apache release 1.8.1 that would fix these minor
> > > >>>> issues that are not deal breakers.
> > > >>>>
> > > >>>> @bolke, I'm curious, how long does it take you to go through one release
> > > >>>> cycle? Oh, and do you have a documented step by step process for
> > > >>>> releasing? I'd like to add the Pypi part to this doc and add committers
> > > >>>> that are interested to have rights on the project on Pypi.
> > > >>>>
> > > >>>> Max
> > > >>>>
> > > >>>> On Wed, Feb 22, 2017 at 2:00 PM, Bolke de Bruin <bdbruin@gmail.com> wrote:
> > > >>>>
> > > >>>>> So it is a database integrity issue? Afaik a start_date should always
> > > >>>>> be set for a DagRun (create_dagrun does so). I didn't check the code
> > > >>>>> though.
> > > >>>>>
> > > >>>>> Sent from my iPhone
> > > >>>>>
> > > >>>>>> On 22 Feb 2017, at 22:19, Dan Davydov <dan.davydov@airbnb.com.INVALID> wrote:
> > > >>>>>>
> > > >>>>>> Should clarify this occurs when a dagrun does not have a start date,
> > > >>>>>> not a dag (which makes it even less likely to happen). I don't think
> > > >>>>>> this is a blocker for releasing.
> > > >>>>>>
> > > >>>>>>> On Wed, Feb 22, 2017 at 1:15 PM, Dan Davydov <dan.davydov@airbnb.com> wrote:
> > > >>>>>>>
> > > >>>>>>> I rolled this out in our prod and the webservers failed to load due to
> > > >>>>>>> this commit:
> > > >>>>>>>
> > > >>>>>>> [AIRFLOW-510] Filter Paused Dags, show Last Run & Trigger Dag
> > > >>>>>>> 7c94d81c390881643f94d5e3d7d6fb351a445b72
> > > >>>>>>>
> > > >>>>>>> This fixed it:
> > > >>>>>>> -                            </a> <span id="statuses_info" class="glyphicon glyphicon-info-sign" aria-hidden="true" title="Start Date: {{last_run.start_date.strftime('%Y-%m-%d %H:%M')}}"></span>
> > > >>>>>>> +                            </a> <span id="statuses_info" class="glyphicon glyphicon-info-sign" aria-hidden="true"></span>
> > > >>>>>>>
> > > >>>>>>> This is caused by assuming that all DAGs have start dates set, so a
> > > >>>>>>> broken DAG will take down the whole UI. Not sure if we want to make
> > > >>>>>>> this a blocker for the release or not, I'm guessing for most
> > > >>>>>>> deployments this would occur pretty rarely. I'll submit a PR to fix it
> > > >>>>>>> soon.
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Tue, Feb 21, 2017 at 9:49 AM, Chris Riccomini <criccomini@apache.org> wrote:
> > > >>>>>>>
> > > >>>>>>>> Ack that the vote has already passed, but belated +1 (binding)
> > > >>>>>>>>
> > > >>>>>>>> On Tue, Feb 21, 2017 at 7:42 AM, Bolke de Bruin <bdbruin@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> IPMC Voting can be found here:
> > > >>>>>>>>>
> > > >>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-general/201702.mbox/%3c676BDC9F-1B55-4469-92A7-9FF309AD0EC8@gmail.com%3e
> > > >>>>>>>>>
> > > >>>>>>>>> Kind regards,
> > > >>>>>>>>> Bolke
> > > >>>>>>>>>
> > > >>>>>>>>>> On 21 Feb 2017, at 08:20, Bolke de Bruin <bdbruin@gmail.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Hello,
> > > >>>>>>>>>>
> > > >>>>>>>>>> Apache Airflow (incubating) 1.8.0 (based on RC4) has been accepted.
> > > >>>>>>>>>>
> > > >>>>>>>>>> 9 “+1” votes received:
> > > >>>>>>>>>>
> > > >>>>>>>>>> - Maxime Beauchemin (binding)
> > > >>>>>>>>>> - Arthur Wiedmer (binding)
> > > >>>>>>>>>> - Dan Davydov (binding)
> > > >>>>>>>>>> - Jeremiah Lowin (binding)
> > > >>>>>>>>>> - Siddharth Anand (binding)
> > > >>>>>>>>>> - Alex van Boxel (binding)
> > > >>>>>>>>>> - Bolke de Bruin (binding)
> > > >>>>>>>>>>
> > > >>>>>>>>>> - Jayesh Senjaliya (non-binding)
> > > >>>>>>>>>> - Yi (non-binding)
> > > >>>>>>>>>>
> > > >>>>>>>>>> Vote thread (start):
> > > >>>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-airflow-dev/201702.mbox/%3cD360D9BE-C358-42A1-9188-6C92C31A2F8B@gmail.com%3e
> > > >>>>>>>>>> <http://mail-archives.apache.org/mod_mbox/incubator-airflow-dev/201702.mbox/%3C7EB7B6D6-092E-48D2-AA0F-15F44376A8FF@gmail.com%3E>
> > > >>>>>>>>>>
> > > >>>>>>>>>> Next steps:
> > > >>>>>>>>>> 1) I will start the voting process at the IPMC mailing list. I do
> > > >>>>>>>>>> expect some changes to be required, mostly in documentation, maybe a
> > > >>>>>>>>>> license here and there. So, we might end up with changes to stable. As
> > > >>>>>>>>>> long as these are not (significant) code changes I will not re-raise
> > > >>>>>>>>>> the vote.
> > > >>>>>>>>>> 2) Only after the positive voting on the IPMC and finalisation will I
> > > >>>>>>>>>> rebrand the RC to Release.
> > > >>>>>>>>>> 3) I will upload it to the incubator release page, then the tarball
> > > >>>>>>>>>> needs to propagate to the mirrors.
> > > >>>>>>>>>> 4) Update the website (can someone volunteer please?)
> > > >>>>>>>>>> 5) Finally, I will ask Maxime to upload it to pypi. It seems we can
> > > >>>>>>>>>> keep the apache branding as libcloud is doing this as well
> > > >>>>>>>>>> (https://libcloud.apache.org/downloads.html#pypi-package).
> > > >>>>>>>>>> Jippie!
> > > >>>>>>>>>>
> > > >>>>>>>>>> Bolke
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >
> > >
> > > --
> >   _/
> > _/ Alex Van Boxel
> >
> --
>   _/
> _/ Alex Van Boxel
>
