hadoop-yarn-dev mailing list archives

From Sebastiano Spicuglia <spicuglia.sebasti...@gmail.com>
Subject Problem with NodeManager and cgroups
Date Thu, 01 May 2014 09:56:05 GMT
Hello all,

I have a problem with the NodeManager and cgroups.
When I try to start a MapReduce job, I get the following error
in the NodeManager log (with DEBUG enabled):
...
2014-05-01 11:02:35,401 WARN
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit
code from container container_1398934193480_0003_02_000001 is : 27
2014-05-01 11:02:35,401 WARN
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor:
Exception from container-launch with container ID:
container_1398934193480_0003_02_000001 and exit code: 27
org.apache.hadoop.util.Shell$ExitCodeException:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:278)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
2014-05-01 11:02:35,401 INFO
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main :
command provided 1
2014-05-01 11:02:35,401 INFO
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main :
user is hduser
2014-05-01 11:02:35,401 INFO
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main :
requested yarn user is root
2014-05-01 11:02:35,401 INFO
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Can't
open file /sys/fs/cgroup/cpu as node manager - Is a directory
...

After some grepping through the source, I found the cause of the problem:

1) container-executor.c:write_pid_to_file_as_nm() tries to write to
/sys/fs/cgroup/cpu, which is a directory.

2) The container-executor is started with the following resource description:
cgroups=/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/container_1398934193480_0003_02_000001/tasks

This is due to my cgroup configuration:
cat /proc/mounts | grep cgroup/cpu
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0

3) The problem is that the directory where I mounted the subsystem has
a comma in its name, and the comma is also the separator that
configuration.c:extract_values uses to parse the resource description.
The path is therefore cut at /sys/fs/cgroup/cpu.
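
To illustrate the effect, here is a minimal Python sketch of comma-based splitting applied to the resource description above. It is not the actual extract_values code from configuration.c; split_resources is a hypothetical helper that only assumes the value is split on a single separator character.

```python
def split_resources(value, sep=","):
    """Hypothetical stand-in for a separator-based split of the resource list."""
    return [part for part in value.split(sep) if part]


# Resource description taken from the container-executor invocation above.
resource = (
    "/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/"
    "container_1398934193480_0003_02_000001/tasks"
)

parts = split_resources(resource)
for part in parts:
    print(part)
# The first element is /sys/fs/cgroup/cpu, the directory named in the log error.
```

With this kind of split, the single tasks path is treated as two entries, and the first one is the mount directory prefix rather than a writable tasks file.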

Should I change my cgroup configuration? Or is it possible to use
another separator in the resource description, for instance a
reserved character such as : (colon)?
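
For anyone who wants to check whether their own nodes are affected, a small sketch along these lines could list the cgroup mount points that contain a comma. The helper name comma_mounts is made up for illustration, and the sample text is just the /proc/mounts output quoted above.

```python
def comma_mounts(mounts_text):
    """Return cgroup mount points whose path contains a comma."""
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "cgroup" and "," in fields[1]:
            hits.append(fields[1])
    return hits


# Sample input copied from the /proc/mounts output shown earlier.
SAMPLE = """\
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
"""

affected = comma_mounts(SAMPLE)
print(affected)
```

On a live system the same function could be fed the contents of /proc/mounts instead of the sample string.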

Thank you in advance
Best regards
Sebastiano
