mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Anand Mazumdar" <mazumdar.an...@gmail.com>
Subject Re: Review Request 38874: Refactored executor struct in Agent for the Executor HTTP API
Date Wed, 14 Oct 2015 00:07:58 GMT


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.hpp, line 583
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line583>
> >
> >     Could you add the closed() method here to make this consistent with the scheduler's
HttpConnection struct in the master? We may want to consolidate these two structs later.

Done


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.hpp, lines 622-625
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line622>
> >
> >     Looks like you also want to warn about TERMINATING?
> >     
> >     How about just printing the state instead of saying "disconnected" to be a bit
more explicit?

When the executor is launched, the agent sets its `state = REGISTERING`. So it looks fine
to have the `state == REGISTERING` check.

However, what I had missed was, when the agent sends an executor a `Shutdown` event, it marks
it's state as `TERMINATING` and then sends the event. In short, we would always be disconnected
when the states are `TERMINATED/REGISTERING`. I added the state to the `LOG(WARNING)`. Thanks
for the suggestion.


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.cpp, lines 5117-5119
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line5117>
> >
> >     Per our chat and my other suggestion, hoping we can turn this case into neither
'pid' nor 'http' being set.

Fixed


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.hpp, line 671
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line671>
> >
> >     Does this first "when" belong? It seems a bit confusing that we have this pid
= UPID() case, ideally we can be explicit about when we don't know the connection type yet
by having both set to None, i.e. the equivalent of:
> >     
> >     ```
> >     Either<None, UPID, HttpConnection>
> >     ```

As we had discussed offline, `PID/HTTP` would be both `None()` on launch. While recovering,
if the agent is not able to find the `pid/http` marker file it would set `pid = UPID()`.


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.cpp, line 2391
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line2391>
> >
> >     Ah, I meant to remove `reply` completely due to it being error prone (the implicit
use of a UPID means it breaks if called asynchronously), see:
> >     
> >     https://issues.apache.org/jira/browse/MESOS-765
> >     Here's a case of it causing a bug: https://issues.apache.org/jira/browse/MESOS-1310
> >     
> >     We should update these in a separate patch to do explicit sending based on `from`
and remove `reply` completely.

Sending in a patch for these shortly :)


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.cpp, line 2840
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line2840>
> >
> >     It's a little weird to have these pid checks down here.
> >     
> >     I was going to suggest having a CHECK_SOME at the top of each message handler,
but I'm not sure that's safe (can these paths get triggered when 'pid' is none?).

Previously, we supported other executors to send `statusUpdates` on behalf of other executors
as you had pointed out. If the other executor has not yet `registered`, the value of `executor->pid`
would still be `None()` for it. So , I had to include these extra `isSome()` checks since
these paths can get triggered when `pid` is None()


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.cpp, lines 2850-2851
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line2850>
> >
> >     Just for some context, we originally added this for Aurora's GC executor, which
has now been obsoleted by reconciliation and is no longer in use.
> >     
> >     With the HTTP api, this gets tricker to support (we can't use PIDs and have
to use session identifiers for this). Given that, I'm not sure if we should continue supporting
it in the HTTP code paths.

Based on my conversations with Vinod + what we had already discussed, we won't be supporting
the ability of `HTTP` executors to send messages on behalf of others. Hopefully, that should
make the code-paths in `MESOS-3476` much simpler.


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.hpp, lines 668-669
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line668>
> >
> >     "libprocess message passing"

Fixed


- Anand


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38874/#review101830
-----------------------------------------------------------


On Oct. 6, 2015, 2:22 a.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38874/
> -----------------------------------------------------------
> 
> (Updated Oct. 6, 2015, 2:22 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Isabel Jimenez, and Vinod Kone.
> 
> 
> Bugs: MESOS-3480
>     https://issues.apache.org/jira/browse/MESOS-3480
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This change refactors the Executor struct on Agent and adds support for Executors to
connect via the `api/v1/executor` endpoint on Agent. This is similar to the change done in
Master for the Scheduler HTTP API.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
> 
> Diff: https://reviews.apache.org/r/38874/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message