mesos-reviews mailing list archives

From Kevin Klues <klue...@gmail.com>
Subject Re: Review Request 44366: Added GPUs as an explicit resource.
Date Mon, 14 Mar 2016 07:39:36 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44366/
-----------------------------------------------------------

(Updated March 14, 2016, 7:39 a.m.)


Review request for mesos, Ben Mahler, Rob Todd, and Vikrama Ditya.


Changes
-------

Addressed all of klaus1982, bmahler, and vditya's comments.


Bugs: MESOS-4865
    https://issues.apache.org/jira/browse/MESOS-4865


Repository: mesos


Description (updated)
-------

Currently, we enforce that the number of GPUs specified in the 'gpus'
resource parameter equal the number of GPUs passed in via the
--nvidia_gpu_devices flag. In the future, we will generalize this via
autodiscovery of GPUs and support for GPU types other than Nvidia.
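
For readers skimming the request, the check described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in the patch: `GpuCheck` and `checkGpus` are hypothetical names, the `gpus` scalar is assumed to arrive as a `double`, and the order of the checks is only inferred from the Testing output in this request.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical result type for this sketch (not a Mesos type).
enum class GpuCheck {
  OK,
  NOT_UNSIGNED_INTEGER,
  MISSING_DEVICES,
  COUNT_MISMATCH
};

// Hypothetical helper: validates a `gpus` scalar against the list of
// device ids that would come from the `--nvidia_gpu_devices` flag.
// Assumes the `gpus` resource was explicitly specified by the operator.
GpuCheck checkGpus(double gpus, const std::vector<unsigned int>& devices)
{
  // Reject negative or fractional values such as 4.9 (also rejects NaN).
  if (gpus < 0.0 || std::floor(gpus) != gpus) {
    return GpuCheck::NOT_UNSIGNED_INTEGER;
  }

  // A `gpus` resource without any listed devices is rejected.
  if (devices.empty()) {
    return GpuCheck::MISSING_DEVICES;
  }

  // The two counts must agree exactly.
  if (gpus != static_cast<double>(devices.size())) {
    return GpuCheck::COUNT_MISMATCH;
  }

  return GpuCheck::OK;
}
```

Under these assumptions, `checkGpus(4.0, {1, 2, 3, 4})` is accepted, while dropping one device id or passing `4.9` is rejected.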


Diffs (updated)
-----

  include/mesos/resources.hpp bb343ad852576a75615a93ef850b413bf77698e0 
  include/mesos/v1/resources.hpp 719110fbbf39f1755460ac0b32e3893656054a4e 
  src/common/resources.cpp 1f23dc83f7330c305a836d698f114b7eaf3d7ba1 
  src/slave/containerizer/containerizer.cpp f6fc7863d0c215611f170dc0c89aa229407b5137 
  src/v1/resources.cpp c6f125ec317e2da537a6456f5cff2da0a48701d8 

Diff: https://reviews.apache.org/r/44366/diff/


Testing (updated)
-------

./bin/mesos-slave.sh --master=127.0.0.1:5050 --resources="gpus:string"
Failed to determine slave resources: Bad type for resource gpus value string type TEXT

./bin/mesos-slave.sh --master=127.0.0.1:5050 --resources="gpus:4.9"
Failed to determine slave resources: The `gpus` resource must specified as an unsigned integer
   
./bin/mesos-slave.sh --master=127.0.0.1:5050 --resources="gpus:4.0"
Failed to determine slave resources: When specifying the `gpus` resource, you must also specify
a list of GPUs via the `--nvidia_gpu_devices` flag

./bin/mesos-slave.sh --master=127.0.0.1:5050 --resources="gpus:4.0" --nvidia_gpu_devices=1,2,3
Failed to determine slave resources: The number of GPUs passed in the `--nvidia_gpu_devices`
flag must match the number of GPUs specified in the `gpus` resource

./bin/mesos-slave.sh --master=127.0.0.1:5050 --resources="gpus:4.0" --nvidia_gpu_devices=1,2,3,4
SUCCESS

**NOTE**: I didn't set the `--isolation` flag here, so the agent started up correctly
(i.e. it didn't error out saying that the Nvidia GPU isolator is currently unsupported).
This is the correct behaviour, and it properly exercises the code added in this patch.


Thanks,

Kevin Klues

