hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13035) Add states INITING and STARTING to YARN Service model to cover in-transition states.
Date Sun, 01 May 2016 17:23:12 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265839#comment-15265839

Steve Loughran commented on HADOOP-13035:

-1, as it

This is a pretty fundamental change. I would have also really liked to have been pinged on
this earlier, given my hands are all over the code as it stands. While I acknowledge it isn't
perfect, it does include experience on other systems, and I did go through every single YARN
service, repeatedly, until things were stable.

This whole discrepancy between state->starting and service->live is a recurrent problem,
but as you can see from things like web and IPC servers starting in the background, service
start() is inherently async; what code really needs to wait upon is not the state change completing,
but the started service going live, which *may happen at some indeterminate point
in the future*.
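
To illustrate the gap between "started" and "live", here is a minimal sketch. It is not Hadoop code: {{BackgroundServer}}, {{awaitLive()}} and the 50 ms delay are all hypothetical names and values invented for the example. It only shows a start() that returns before the service is usable, plus a separate call that callers can block on.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: none of these names exist in Hadoop.
class BackgroundServer {
    private final CountDownLatch live = new CountDownLatch(1);
    private volatile boolean started;

    /** Returns as soon as the worker thread is launched, like an async start(). */
    void start() {
        started = true; // the state flips before the server is actually usable
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(50); // stand-in for binding ports, warming caches, etc.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            live.countDown(); // only now is the service really live
        });
        worker.setDaemon(true);
        worker.start();
    }

    boolean isStarted() {
        return started;
    }

    boolean isLive() {
        return live.getCount() == 0;
    }

    /** Blocks until live or the timeout expires; returns false on timeout or interrupt. */
    boolean awaitLive(long timeoutMillis) {
        try {
            return live.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller that needs a usable server would call start() and then awaitLive(...), rather than treating the return of start() as proof that the service is live.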

Without digging into this patch in detail, here are the places which have caused most trouble
over time, which any patch to such a fundamental bit of how the YARN services are constructed
is going to have to look at:

* subclasses of {{CompositeService}} adding new services in service start, having to push
them through their lifecycle enough to attach them to their parent, then relying on the remainder
of the serviceStart lifecycle to walk them through the rest.
* things going wrong in composite start and having to unroll the stack
* things trying to call stop() during start.
* the fact that calling start() on a service which is started *or in the process of starting*
is required to be a no-op.
* the issue of when serviceStop() is invoked on a service when stop() is called. Currently:
not until you init(). It had better be after initing() now.
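
A rough sketch of the re-entrancy rules in that list, assuming a transitional STARTING state. This is a hypothetical model, not Hadoop's {{AbstractService}}: {{SketchService}}, its {{State}} enum and the {{calls}} log are invented for illustration, and the "no serviceStop() before init()" rule simply follows the description above rather than being checked against the Hadoop source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration; not Hadoop's AbstractService.
class SketchService {
    enum State { NOTINITED, INITED, STARTING, STARTED, STOPPED }

    private State state = State.NOTINITED;
    final List<String> calls = new ArrayList<>(); // records lifecycle callbacks

    synchronized State getState() {
        return state;
    }

    synchronized void init() {
        if (state == State.NOTINITED) {
            state = State.INITED;
        }
    }

    synchronized void start() {
        // Required no-op when already started *or in the process of starting*.
        if (state == State.STARTING || state == State.STARTED) {
            return;
        }
        if (state != State.INITED) {
            throw new IllegalStateException("cannot start from " + state);
        }
        state = State.STARTING;
        calls.add("serviceStart");
        serviceStart();
        // If serviceStart() called stop(), the state is STOPPED: leave it alone.
        if (state == State.STARTING) {
            state = State.STARTED;
        }
    }

    synchronized void stop() {
        if (state == State.STOPPED) {
            return;
        }
        State previous = state;
        state = State.STOPPED;
        // Assumption taken from the list above: no serviceStop() before init().
        if (previous != State.NOTINITED) {
            calls.add("serviceStop");
            serviceStop();
        }
    }

    protected void serviceStart() { }

    protected void serviceStop() { }
}

// A service that gives up part-way through its own start.
class StopsItselfDuringStart extends SketchService {
    @Override
    protected void serviceStart() {
        stop(); // re-entrant call on the same thread; the monitor is reentrant
    }
}
```

After init() and start(), an instance of {{StopsItselfDuringStart}} ends up STOPPED rather than STARTED, because start() only promotes the state if it is still STARTING when serviceStart() returns.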

Can I also note that the ubiquity of YarnClient means this class gets used a lot downstream.
Admittedly, I use it most of all, but you can essentially build YARN-based apps by aggregating
their service lifecycles together. Which means there is a risk that things may change. Before
a descendant of this patch goes in, someone is going to have to have built and tested slider's
functional test suite against a version of Hadoop with this turned on. I think they'll be
able to dodge doing the same in Hive, as Hive 1.2.x still uses a cut-and-paste of the
2.0 service model before the YARN-117 patch went in; which, if you've ever seen how Spark
Thriftserver abuses introspection to subclass (SPARK-8064, SPARK-10793), you'll be grateful for.

I also want to know what happens to YARN-679 and YARN-1564 with this. I propose adding them first,
as that will expand the codebase, and, as much of this is code which I can migrate slider
to, will make it easier for slider to adapt to a change this fundamental.

Accordingly, I'll tag this as a depends-on there, rebase those two patches with trunk and
await reviews.

> Add states INITING and STARTING to YARN Service model to cover in-transition states.
> ------------------------------------------------------------------------------------
>                 Key: HADOOP-13035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13035
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>         Attachments: 0001-HADOOP-13035.patch, 0002-HADOOP-13035.patch, 0003-HADOOP-13035.patch
> As per the discussion in YARN-3971, we should be setting the service state to STARTED
only after serviceStart()
> Currently {{AbstractService#start()}} is
> {noformat} 
>      if (stateModel.enterState(STATE.STARTED) != STATE.STARTED) {
>         try {
>           startTime = System.currentTimeMillis();
>           serviceStart();
> ..
>  }
> {noformat}
> enterState sets the service state to the proposed state, so {{service.getServiceState}}
in {{serviceStart()}} will return STARTED.
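
The behaviour described in the quoted issue can be reproduced with a small stand-alone model. This is a sketch only: {{MiniStateModel}} and {{EagerStartService}} are hypothetical names, and the model assumes that enterState() records the proposed state and returns the previous one, which is what the {{!=}} comparison in the quoted snippet relies on.

```java
// Hypothetical stand-in for a state model; not the Hadoop class.
class MiniStateModel {
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    private State state = State.INITED; // pretend init() has already run

    /** Records the proposed state and returns the state held before the call. */
    synchronized State enterState(State proposed) {
        State old = state;
        state = proposed;
        return old;
    }

    synchronized State getState() {
        return state;
    }
}

// Mirrors the shape of the quoted start(): enter STARTED first, then do the work.
class EagerStartService {
    final MiniStateModel stateModel = new MiniStateModel();
    MiniStateModel.State seenDuringStart; // what serviceStart() observed

    void start() {
        if (stateModel.enterState(MiniStateModel.State.STARTED)
                != MiniStateModel.State.STARTED) {
            serviceStart();
        }
    }

    void serviceStart() {
        // The state was already switched, so this observes STARTED
        // even though startup work is still in progress.
        seenDuringStart = stateModel.getState();
    }
}
```

In this model, code running inside serviceStart() cannot tell "starting" from "started", which is the gap the proposed in-transition states are meant to cover.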

This message was sent by Atlassian JIRA
