From reviews-return-38495-apmail-mesos-reviews-archive=mesos.apache.org@mesos.apache.org Tue Jun 28 17:08:32 2016 Return-Path: X-Original-To: apmail-mesos-reviews-archive@minotaur.apache.org Delivered-To: apmail-mesos-reviews-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 43B7D19D5D for ; Tue, 28 Jun 2016 17:08:32 +0000 (UTC) Received: (qmail 16834 invoked by uid 500); 28 Jun 2016 17:08:32 -0000 Delivered-To: apmail-mesos-reviews-archive@mesos.apache.org Received: (qmail 16805 invoked by uid 500); 28 Jun 2016 17:08:31 -0000 Mailing-List: contact reviews-help@mesos.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: reviews@mesos.apache.org Delivered-To: mailing list reviews@mesos.apache.org Received: (qmail 16784 invoked by uid 99); 28 Jun 2016 17:08:31 -0000 Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 28 Jun 2016 17:08:31 +0000 Received: from reviews.apache.org (localhost [127.0.0.1]) by reviews.apache.org (Postfix) with ESMTP id 6FFFB2BD2CF; Tue, 28 Jun 2016 17:08:31 +0000 (UTC) Content-Type: multipart/alternative; boundary="===============7934614010720430883==" MIME-Version: 1.0 Subject: Re: Review Request 48373: Integrated the `NvidiaGpuAllocator` into the `NvidiaGpuIsolator`. From: Kevin Klues To: Benjamin Mahler Cc: Kevin Klues , mesos Date: Tue, 28 Jun 2016 17:08:31 -0000 Message-ID: <20160628170831.1635.74009@reviews.apache.org> X-ReviewBoard-URL: https://reviews.apache.org/ Auto-Submitted: auto-generated Sender: Kevin Klues X-ReviewGroup: mesos X-Auto-Response-Suppress: DR, RN, OOF, AutoReply X-ReviewRequest-URL: https://reviews.apache.org/r/48373/ X-Sender: Kevin Klues References: <20160628025510.24740.75498@reviews.apache.org> In-Reply-To: <20160628025510.24740.75498@reviews.apache.org> Reply-To: Kevin Klues X-ReviewRequest-Repository: mesos --===============7934614010720430883== MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit > On June 28, 2016, 2:55 a.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/nvidia.cpp, lines 442-446 > > > > > > Unfortunately this won't accomplish your intent of providing context in the failure message because the .onFailed method does not provide composition, your returned Failure here will fall into the void. > > > > To accomplish this, we need a .then which can accept a Future in addition to being able to take a T. Yes, I see this now. The .then is just chained onto the .onFailed. In fact, the way I had it before, I think we never would have actually cleaned up anything (the .onFailed() was the only thing chained to the original Future). > On June 28, 2016, 2:55 a.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/nvidia.cpp, line 375 > > > > > > Hm.. seems we should do another lookup of the 'info' since we are returning to the isolator context and things may have occurred in the interim. Perhaps it's worth pulling out an `_update` continuation to force this (no captures possible). I see you committed this with: ``` .then(defer(PID(this), &NvidiaGpuIsolatorProcess::_update, containerId, lambda::_1)); ``` Why `PID(this)` instead of `self()`? Also, I see why it makes sense to re-lookup `info`, but I'm unclear on why the continuation had to be pulled out into a separate function. The logic feels very disjoint now when adding GPUs vs. taking them away. Maybe it's just the name `_update` since it only actually gets triggered in the "adding" GPUs case. > On June 28, 2016, 2:55 a.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/nvidia.cpp, line 447 > > > > > > This needs to defer to self since the infos map is managed by the isolator actor. In general, capturing and using 'this' within a continuation without defering to self is dangerous. Yes. This was an oversight. When I started thsi patch I was unclear on the need for `defer()` inside the `.then()`, and looks like I missed one when doing a pass over the code to add it. > On June 28, 2016, 2:55 a.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/nvidia.cpp, line 261 > > > > > > Hm.. wonder why we didn't just use foreach here? I believe the original code needed a normal `for` look for some reason and the new code just inherited this form. I agree -- a `foreach` is more appropriate here now though. > On June 28, 2016, 2:55 a.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/nvidia.cpp, lines 448-451 > > > > > > To guard against races, how about: > > > > ``` > > CHECK(infos.contains(containerId)); > > delete infos.at(containerId); > > infos.erase(containerId); > > > > return Nothing(); > > ``` Makes sense. It's the same reasoning as for `update()` with needing to re-lookup `info` (only this time we don't need to store a temporary to it since we are just deleting it). - Kevin ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/48373/#review139714 ----------------------------------------------------------- On June 23, 2016, 4:34 p.m., Kevin Klues wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/48373/ > ----------------------------------------------------------- > > (Updated June 23, 2016, 4:34 p.m.) > > > Review request for mesos and Benjamin Mahler. > > > Bugs: MESOS-5559 > https://issues.apache.org/jira/browse/MESOS-5559 > > > Repository: mesos > > > Description > ------- > > All logic to do GPU allocation is now conducted asynchronously through > the `NvidiaGpuAllocator` component. > > > Diffs > ----- > > src/slave/containerizer/mesos/isolators/gpu/nvidia.hpp 28ad0b3d1d8e7642a93e4ebb0fcc399e182be393 > src/slave/containerizer/mesos/isolators/gpu/nvidia.cpp cfb23244d0e6f6e0d94685a35826b35f3297c6fe > > Diff: https://reviews.apache.org/r/48373/diff/ > > > Testing > ------- > > GTEST_FILTER="" make -j check && sudo GTEST_FILTER="*NVIDIA*" src/mesos-tests > > > Thanks, > > Kevin Klues > > --===============7934614010720430883==--