spark-user mailing list archives

From LinCharlie <lin_q...@outlook.com>
Subject Spark On Yarn Issue: Initial job has not accepted any resources
Date Tue, 18 Nov 2014 07:53:37 GMT
Hi All,

I was submitting a spark_program.jar to a Spark-on-YARN cluster from a driver machine in yarn-client mode. Here is the spark-submit command I used:

./spark-submit --master yarn-client \
  --class com.charlie.spark.grax.OldFollowersExample \
  --queue dt_spark \
  ~/script/spark-flume-test-0.1-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.1.jar

The queue `dt_spark` was free, and the program was submitted successfully and ran on the cluster. But the console repeatedly showed:
14/11/18 15:11:48 WARN YarnClientClusterScheduler: Initial job has not accepted any resources;
check your cluster UI to ensure that workers are registered and have sufficient memory
I checked the logs in the cluster UI and found no errors:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/disk5/yarn/usercache/linqili/filecache/6957209742046754908/spark-assembly-1.0.2-hadoop2.0.0-cdh4.2.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.0.0-cdh4.2.1/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/11/18 14:28:16 INFO SecurityManager: Changing view acls to: hadoop,linqili
14/11/18 14:28:16 INFO SecurityManager: SecurityManager: authentication disabled; ui acls
disabled; users with view permissions: Set(hadoop, linqili)
14/11/18 14:28:17 INFO Slf4jLogger: Slf4jLogger started
14/11/18 14:28:17 INFO Remoting: Starting remoting
14/11/18 14:28:17 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkYarnAM@longzhou-hdp3.lz.dscc:37187]
14/11/18 14:28:17 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkYarnAM@longzhou-hdp3.lz.dscc:37187]
14/11/18 14:28:17 INFO ExecutorLauncher: ApplicationAttemptId: appattempt_1415961020140_0325_000001
14/11/18 14:28:17 INFO ExecutorLauncher: Connecting to ResourceManager at longzhou-hdpnn.lz.dscc/192.168.19.107:12032
14/11/18 14:28:17 INFO ExecutorLauncher: Registering the ApplicationMaster
14/11/18 14:28:18 INFO ExecutorLauncher: Waiting for spark driver to be reachable.
14/11/18 14:28:18 INFO ExecutorLauncher: Master now available: 192.168.59.90:36691
14/11/18 14:28:18 INFO ExecutorLauncher: Listen to driver: akka.tcp://spark@192.168.59.90:36691/user/CoarseGrainedScheduler
14/11/18 14:28:18 INFO ExecutorLauncher: Allocating 1 executors.
14/11/18 14:28:18 INFO YarnAllocationHandler: Allocating 1 executor containers with 1408 of
memory each.
14/11/18 14:28:18 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 1,
priority = 1 , capability : memory: 1408)
14/11/18 14:28:18 INFO YarnAllocationHandler: Allocating 1 executor containers with 1408 of
memory each.
14/11/18 14:28:18 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 1,
priority = 1 , capability : memory: 1408)
14/11/18 14:28:18 INFO RackResolver: Resolved longzhou-hdp3.lz.dscc to /rack1
14/11/18 14:28:18 INFO YarnAllocationHandler: launching container on container_1415961020140_0325_01_000002
host longzhou-hdp3.lz.dscc
14/11/18 14:28:18 INFO ExecutorRunnable: Starting Executor Container
14/11/18 14:28:18 INFO ExecutorRunnable: Connecting to ContainerManager at longzhou-hdp3.lz.dscc:12040
14/11/18 14:28:18 INFO ExecutorRunnable: Setting up ContainerLaunchContext
14/11/18 14:28:18 INFO ExecutorRunnable: Preparing Local resources
14/11/18 14:28:18 INFO ExecutorLauncher: All executors have launched.
14/11/18 14:28:18 INFO ExecutorLauncher: Started progress reporter thread - sleep time : 5000
14/11/18 14:28:18 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 0,
priority = 1 , capability : memory: 1408)
14/11/18 14:28:18 INFO ExecutorRunnable: Prepared Local resources Map(__spark__.jar ->
resource {, scheme: "hdfs", host: "longzhou-hdpnn.lz.dscc", port: 11000, file: "/user/linqili/.sparkStaging/application_1415961020140_0325/spark-assembly-1.0.2-hadoop2.0.0-cdh4.2.1.jar",
}, size: 134859131, timestamp: 1416292093988, type: FILE, visibility: PRIVATE, )
14/11/18 14:28:18 INFO ExecutorRunnable: Setting up executor with commands: List($JAVA_HOME/bin/java,
-server, -XX:OnOutOfMemoryError='kill %p', -Xms1024m -Xmx1024m , -Djava.security.krb5.conf=/home/linqili/proc/spark_client/hadoop/kerberos5-client/etc/krb5.conf
-Djava.library.path=/home/linqili/proc/spark_client/hadoop/lib/native/Linux-amd64-64, -Djava.io.tmpdir=$PWD/tmp,
 -Dlog4j.configuration=log4j-spark-container.properties, org.apache.spark.executor.CoarseGrainedExecutorBackend,
akka.tcp://spark@192.168.59.90:36691/user/CoarseGrainedScheduler, 1, longzhou-hdp3.lz.dscc,
3, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
14/11/18 14:28:23 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 0,
priority = 1 , capability : memory: 1408)
14/11/18 14:28:23 INFO YarnAllocationHandler: Completed container container_1415961020140_0325_01_000002
(state: COMPLETE, exit status: 1)
14/11/18 14:28:23 INFO YarnAllocationHandler: Container marked as failed: container_1415961020140_0325_01_000002
14/11/18 14:28:28 INFO ExecutorLauncher: Allocating 1 containers to make up for (potentially
?) lost containers
14/11/18 14:28:28 INFO YarnAllocationHandler: Allocating 1 executor containers with 1408 of
memory each.
14/11/18 14:28:28 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 1,
priority = 1 , capability : memory: 1408)
14/11/18 14:28:33 INFO ExecutorLauncher: Allocating 1 containers to make up for (potentially
?) lost containers
14/11/18 14:28:33 INFO YarnAllocationHandler: Allocating 1 executor containers with 1408 of
memory each.
14/11/18 14:28:33 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 1,
priority = 1 , capability : memory: 1408)
14/11/18 14:28:33 INFO RackResolver: Resolved longzhou-hdp2.lz.dscc to /rack1
14/11/18 14:28:33 INFO YarnAllocationHandler: launching container on container_1415961020140_0325_01_000003
host longzhou-hdp2.lz.dscc
14/11/18 14:28:33 INFO ExecutorRunnable: Starting Executor Container
14/11/18 14:28:33 INFO ExecutorRunnable: Connecting to ContainerManager at longzhou-hdp2.lz.dscc:12040
14/11/18 14:28:33 INFO ExecutorRunnable: Setting up ContainerLaunchContext
14/11/18 14:28:33 INFO ExecutorRunnable: Preparing Local resources
14/11/18 14:28:33 INFO ExecutorRunnable: Prepared Local resources Map(__spark__.jar ->
resource {, scheme: "hdfs", host: "longzhou-hdpnn.lz.dscc", port: 11000, file: "/user/linqili/.sparkStaging/application_1415961020140_0325/spark-assembly-1.0.2-hadoop2.0.0-cdh4.2.1.jar",
}, size: 134859131, timestamp: 1416292093988, type: FILE, visibility: PRIVATE, )
14/11/18 14:28:33 INFO ExecutorRunnable: Setting up executor with commands: List($JAVA_HOME/bin/java,
-server, -XX:OnOutOfMemoryError='kill %p', -Xms1024m -Xmx1024m , -Djava.security.krb5.conf=/home/linqili/proc/spark_client/hadoop/kerberos5-client/etc/krb5.conf
-Djava.library.path=/home/linqili/proc/spark_client/hadoop/lib/native/Linux-amd64-64, -Djava.io.tmpdir=$PWD/tmp,
 -Dlog4j.configuration=log4j-spark-container.properties, org.apache.spark.executor.CoarseGrainedExecutorBackend,
akka.tcp://spark@192.168.59.90:36691/user/CoarseGrainedScheduler, 2, longzhou-hdp2.lz.dscc,
3, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
14/11/18 14:28:38 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 0,
priority = 1 , capability : memory: 1408)
14/11/18 14:28:38 INFO YarnAllocationHandler: Ignoring container container_1415961020140_0325_01_000004
at host longzhou-hdp2.lz.dscc, since we already have the required number of containers for
it.
14/11/18 14:28:38 INFO YarnAllocationHandler: Completed container container_1415961020140_0325_01_000003
(state: COMPLETE, exit status: 1)
14/11/18 14:28:38 INFO YarnAllocationHandler: Container marked as failed: container_1415961020140_0325_01_000003
14/11/18 14:28:43 INFO ExecutorLauncher: Allocating 1 containers to make up for (potentially
?) lost containers
14/11/18 14:28:43 INFO YarnAllocationHandler: Releasing 1 containers. pendingReleaseContainers
: {container_1415961020140_0325_01_000004=true}
14/11/18 14:28:43 INFO YarnAllocationHandler: Allocating 1 executor containers with 1408 of
memory each.
14/11/18 14:28:43 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 1,
priority = 1 , capability : memory: 1408)
14/11/18 14:28:48 INFO ExecutorLauncher: Allocating 1 containers to make up for (potentially
?) lost containers
14/11/18 14:28:48 INFO YarnAllocationHandler: Allocating 1 executor containers with 1408 of
memory each.
14/11/18 14:28:48 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 1,
priority = 1 , capability : memory: 1408)
14/11/18 14:28:48 INFO YarnAllocationHandler: launching container on container_1415961020140_0325_01_000005
host longzhou-hdp2.lz.dscc
14/11/18 14:28:48 INFO ExecutorRunnable: Starting Executor Container
14/11/18 14:28:48 INFO ExecutorRunnable: Connecting to ContainerManager at longzhou-hdp2.lz.dscc:12040
14/11/18 14:28:48 INFO ExecutorRunnable: Setting up ContainerLaunchContext
14/11/18 14:28:48 INFO ExecutorRunnable: Preparing Local resources
14/11/18 14:28:48 INFO ExecutorRunnable: Prepared Local resources Map(__spark__.jar ->
resource {, scheme: "hdfs", host: "longzhou-hdpnn.lz.dscc", port: 11000, file: "/user/linqili/.sparkStaging/application_1415961020140_0325/spark-assembly-1.0.2-hadoop2.0.0-cdh4.2.1.jar",
}, size: 134859131, timestamp: 1416292093988, type: FILE, visibility: PRIVATE, )
14/11/18 14:28:48 INFO ExecutorRunnable: Setting up executor with commands: List($JAVA_HOME/bin/java,
-server, -XX:OnOutOfMemoryError='kill %p', -Xms1024m -Xmx1024m , -Djava.security.krb5.conf=/home/linqili/proc/spark_client/hadoop/kerberos5-client/etc/krb5.conf
-Djava.library.path=/home/linqili/proc/spark_client/hadoop/lib/native/Linux-amd64-64, -Djava.io.tmpdir=$PWD/tmp,
 -Dlog4j.configuration=log4j-spark-container.properties, org.apache.spark.executor.CoarseGrainedExecutorBackend,
akka.tcp://spark@192.168.59.90:36691/user/CoarseGrainedScheduler, 3, longzhou-hdp2.lz.dscc,
3, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
14/11/18 14:28:53 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 0,
priority = 1 , capability : memory: 1408)
14/11/18 14:28:53 INFO YarnAllocationHandler: Ignoring container container_1415961020140_0325_01_000006
at host longzhou-hdp2.lz.dscc, since we already have the required number of containers for
it.

Is there any hint? Thanks.
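In case it matters, I did not set any executor resources explicitly, so the 1408 MB / 1-core requests in the log seem to correspond to the default 1024 MB executor memory plus overhead (matching the -Xms1024m -Xmx1024m in the launch command). A submission with the resources spelled out would look roughly like the sketch below (the executor count, memory, and core values are placeholders for illustration, not values I have actually tested):

# same job, but with executor resources pinned explicitly (values are illustrative)
./spark-submit --master yarn-client \
  --class com.charlie.spark.grax.OldFollowersExample \
  --queue dt_spark \
  --num-executors 2 \
  --executor-memory 1g \
  --executor-cores 1 \
  --driver-memory 1g \
  ~/script/spark-flume-test-0.1-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.1.jar

Since each container exits with status 1 right after launch, I will also try to pull the stderr of a failed container (e.g. container_1415961020140_0325_01_000002) from the NodeManager UI, or, if log aggregation is enabled, with `yarn logs -applicationId application_1415961020140_0325`, to see what the executor itself reports.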