hadoop-yarn-dev mailing list archives

From Hardik Pandya <smarty.ju...@gmail.com>
Subject Re: Problem with NodeManager and cgroups
Date Fri, 02 May 2014 15:32:16 GMT
Looking at yarn-default.xml, I do not think a comma is going to work for you:

Property: yarn.nodemanager.linux-container-executor.cgroups.hierarchy
Value: /hadoop-yarn
Description: The cgroups hierarchy under which to place YARN processes
(*cannot contain commas*). If
yarn.nodemanager.linux-container-executor.cgroups.mount is false (that is,
if cgroups have been pre-configured), then this cgroups hierarchy must
already exist and be writable by the NodeManager user, otherwise the
NodeManager may fail. Only used when the LCE resources handler is set to
the CgroupsLCEResourcesHandler.
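
To illustrate why the comma hurts, here is a rough sketch in Python. It is a toy model of comma-separated value parsing, not the actual container-executor C code; the helper name split_resources is made up for this example. The input is the resource description from the quoted message below.

```python
def split_resources(resource_description):
    """Toy model of 'key=v1,v2,...' parsing (hypothetical helper,
    not the real configuration.c:extract_values implementation)."""
    key, _, value = resource_description.partition("=")
    # Splitting on ',' cannot tell a separator from a comma inside a path.
    return key, value.split(",")


desc = ("cgroups=/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/"
        "container_1398934193480_0003_02_000001/tasks")
key, paths = split_resources(desc)
for path in paths:
    print(path)
```

Under this model the single tasks path is cut in two, and the first piece is /sys/fs/cgroup/cpu, which matches the "Can't open file /sys/fs/cgroup/cpu" line in the log.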


On Thu, May 1, 2014 at 5:56 AM, Sebastiano Spicuglia <
spicuglia.sebastiano@gmail.com> wrote:

> Hello all,
>
> I have a problem with the NodeManager and cgroups.
> When I try to start a mapreduce job I get the following error
> in the log of NodeManager (DEBUG ON):
> ...
> 2014-05-01 11:02:35,401 WARN
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit
> code from container container_1398934193480_0003_02_000001 is : 27
> 2014-05-01 11:02:35,401 WARN
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor:
> Exception from container-launch with container ID:
> container_1398934193480_0003_02_000001 and exit code: 27
> org.apache.hadoop.util.Shell$ExitCodeException:
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
>         at org.apache.hadoop.util.Shell.run(Shell.java:418)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:278)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> 2014-05-01 11:02:35,401 INFO
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main :
> command provided 1
> 2014-05-01 11:02:35,401 INFO
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main :
> user is hduser
> 2014-05-01 11:02:35,401 INFO
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main :
> requested yarn user is root
> 2014-05-01 11:02:35,401 INFO
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Can't
> open file /sys/fs/cgroup/cpu as node manager - Is a directory
> ...
>
> After a bit of grep, I found the source of the problem:
>
> 1) container-executor.c:write_pid_to_file_as_nm() tries to write in
> /sys/fs/cgroup/cpu.
>
> 2) The container-executor is started with the following resource
> description:
>
> cgroups=/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/container_1398934193480_0003_02_000001/tasks
>
> This is due to my cgroup configuration:
> cat /proc/mounts | grep cgroup/cpu
> cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
> cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
>
> 3) The problem is that I have a comma in the directory where I mounted
> the subsystem and
> the comma is the separator used by configuration.c:extract_values to
> parse the resource
> description.
>
> Should I change my cgroup configuration? Or is it possible to use
> another separator in
> the resource description? For instance, a reserved character such as :
> (colon)?
>
> Thank you in advance
> Best regards
> Sebastiano
>
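
If it helps, a quick way to spot affected mount points is to scan the mount table for cgroup entries whose mount point contains a comma. The sketch below is only an illustration: the function name cgroup_mounts_with_commas is made up, and it works on the two lines from the message above rather than reading /proc/mounts directly.

```python
def cgroup_mounts_with_commas(mounts_text):
    """Return cgroup mount points that contain a comma (hypothetical helper)."""
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "cgroup" and "," in fields[1]:
            flagged.append(fields[1])
    return flagged


# Sample taken from the quoted /proc/mounts output.
SAMPLE = (
    "cgroup /sys/fs/cgroup/cpuset cgroup "
    "rw,nosuid,nodev,noexec,relatime,cpuset 0 0\n"
    "cgroup /sys/fs/cgroup/cpu,cpuacct cgroup "
    "rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0\n"
)
flagged = cgroup_mounts_with_commas(SAMPLE)
print(flagged)
```

On a real node the same function could be fed the contents of /proc/mounts; any path it reports would run into the comma problem described above.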
