tez-dev mailing list archives

From Jonathan Eagles <jeag...@gmail.com>
Subject Re: Handling ATS downtime
Date Thu, 22 Jan 2015 00:03:08 GMT
I just checked this behavior on a secure cluster: if the job fails to get a
timeline server delegation token or fails to post the domain, the job will
fail. We should consider making these operations "best effort" as well.
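
To illustrate what "best effort" could mean here, below is a minimal sketch
in Python. It is not Tez code. The names best_effort, fetch_timeline_token
and submit_job are hypothetical and stand in for whatever actually performs
the token fetch and job submission. The idea is only that a failure in a
non-critical step is logged at WARN and the job carries on without history
logging.

```python
import logging

log = logging.getLogger("history")


def best_effort(description, operation):
    """Run operation(); on failure, log a warning and return None instead of raising."""
    try:
        return operation()
    except Exception as exc:  # deliberately broad: this step must never fail the job
        log.warning("Ignoring failure while %s: %s", description, exc)
        return None


def fetch_timeline_token():
    # Hypothetical stand-in for requesting a timeline server delegation token.
    # It always fails here, to simulate the timeline server being down.
    raise ConnectionError("timeline server unreachable")


def submit_job():
    # The token fetch is wrapped, so its failure only disables history logging.
    token = best_effort("fetching timeline delegation token", fetch_timeline_token)
    return {"submitted": True, "history_enabled": token is not None}
```

With the timeline server simulated as down, submit_job() still reports the
job as submitted, with history logging turned off. Posting the domain could
be wrapped the same way.
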
On Jan 21, 2015 5:33 PM, "Hitesh Shah" <hitesh@apache.org> wrote:

> Actually at this time, the current impl just logs a WARN when there is a
> failure pushing data to ATS. ATS is not treated as a critical entity as it
> is not needed for job recovery.
>
> — Hitesh
>
> On Jan 21, 2015, at 3:01 PM, Rohini Palaniswamy <rohini.aditya@gmail.com>
> wrote:
>
> > Folks,
> >     In the middle of a big discussion on how to get delegation tokens from
> > ATS for Oozie jobs, another question came up. What is the behaviour of
> > running Tez jobs if ATS goes down? I haven't tried it out, but my guess is
> > that the job is going to fail. Or do we do something now to handle the
> > failure and still have the job complete successfully?
> >
> > Regards,
> > Rohini
>
>
