spark-user mailing list archives

From Jan Holmberg <jan.holmb...@perigeum.fi>
Subject Re: Status stays at ACCEPTED
Date Tue, 20 May 2014 13:16:27 GMT
Still the same. I increased the memory of the node hosting the ResourceManager to 5 GB. I also
spotted an HDFS alert about the replication factor of 3, so I lowered it to match the number of data nodes.
I also shut down all services not in use. The issue remains.

I have noticed the following two events, which fire when I start the Spark run:

ZooKeeper: caught end of stream exception
YARN: The specific max attempts: 0 for application: 1 is invalid, because it is out of the range [1, 2]. Use the global max attempts

-jan


On 20 May 2014, at 11:14, Jan Holmberg <jan.holmberg@perigeum.fi> wrote:

Hi,
each node has 4 GB of memory. After a full reboot and a re-run of SparkPi, the ResourceManager shows
no running containers and 1 pending container.

-jan

On 20 May 2014, at 10:24, <sandy.ryza@cloudera.com> wrote:

Hi Jan,

How much memory capacity is configured for each node?

If you go to the ResourceManager web UI, does it indicate any containers are running?
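
Both questions can also be answered from the command line. A minimal sketch, assuming the ResourceManager's REST endpoint /ws/v1/cluster/metrics is reachable on the web UI port (8088 here, taken from the tracking URL in the original post) and returns compact JSON with no spaces after the colons. The helper name `metric` is hypothetical:

```shell
#!/bin/bash
# Extract one numeric field from the cluster-metrics JSON read on stdin.
# Assumes compact JSON such as {"clusterMetrics":{"availableMB":0,...}}.
metric() {
  grep -o "\"$1\":[0-9-]*" | head -n 1 | cut -d: -f2
}

# Hypothetical usage against the ResourceManager host from the original post:
#   curl -s http://vm-cluster-node2:8088/ws/v1/cluster/metrics > metrics.json
#   metric totalMB < metrics.json            # memory registered by all NodeManagers
#   metric availableMB < metrics.json        # memory still free for new containers
#   metric containersPending < metrics.json  # containers waiting for resources
```

If totalMB is far below the physical memory of the nodes, the NodeManagers are probably registering less memory than expected.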

-Sandy

On May 19, 2014, at 11:43 PM, Jan Holmberg <jan.holmberg@perigeum.fi> wrote:

Hi,
I’m new to Spark and trying to test my first Spark program. SparkPi runs successfully
in yarn-client mode, but when I run the same in yarn-standalone mode, the app gets stuck in the ACCEPTED state.
I’ve spent hours hunting for the reason, but the outcome is always the same. Any hints on what
to look for next?

cheers,
-jan


vagrant@vm-cluster-node1:~$ ./run_pi.sh
14/05/20 06:24:04 INFO RMProxy: Connecting to ResourceManager at vm-cluster-node2/10.211.55.101:8032
14/05/20 06:24:05 INFO Client: Got Cluster metric info from ApplicationsManager (ASM), number of NodeManagers: 2
14/05/20 06:24:05 INFO Client: Queue info ... queueName: root.default, queueCurrentCapacity: 0.0, queueMaxCapacity: -1.0,
   queueApplicationCount = 0, queueChildQueueCount = 0
14/05/20 06:24:05 INFO Client: Max mem capabililty of a single resource in this cluster 2048
14/05/20 06:24:05 INFO Client: Preparing Local resources
14/05/20 06:24:05 INFO Client: Uploading file:/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark/assembly/lib/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar to hdfs://vm-cluster-node2:8020/user/vagrant/.sparkStaging/application_1400563733088_0012/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar
14/05/20 06:24:07 INFO Client: Setting up the launch environment
14/05/20 06:24:07 INFO Client: Setting up container launch context
14/05/20 06:24:07 INFO Client: Command for starting the Spark ApplicationMaster: java -server -Xmx1024m -Djava.io.tmpdir=$PWD/tmp org.apache.spark.deploy.yarn.ApplicationMaster --class org.apache.spark.examples.SparkPi --jar /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark/assembly/lib/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar --args  'yarn-standalone'  --args  '10'  --worker-memory 500 --worker-cores 1 --num-workers 1 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
14/05/20 06:24:07 INFO Client: Submitting application to ASM
14/05/20 06:24:07 INFO YarnClientImpl: Submitted application application_1400563733088_0012
14/05/20 06:24:08 INFO Client: Application report from ASM: <THIS PART KEEPS REPEATING FOREVER>
  application identifier: application_1400563733088_0012
  appId: 12
  clientToAMToken: null
  appDiagnostics:
  appMasterHost: N/A
  appQueue: root.vagrant
  appMasterRpcPort: -1
  appStartTime: 1400567047343
  yarnAppState: ACCEPTED
  distributedFinalState: UNDEFINED
  appTrackingUrl: http://vm-cluster-node2:8088/proxy/application_1400563733088_0012/
  appUser: vagrant


The log files give me no additional help. The latest log entry just acknowledges the state change:

hadoop-yarn/hadoop-cmf-yarn-RESOURCEMANAGER-vm-cluster-node2.log.out:2014-05-20 06:24:07,347 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1400563733088_0012 State change from SUBMITTED to ACCEPTED
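
To follow every state transition of a single application rather than only the latest entry, the ResourceManager log can be filtered by application ID. This is a rough sketch that assumes each entry occupies a single line in the log file on disk and keeps the "State change from X to Y" wording shown in the entry quoted here. The function name `state_changes` is hypothetical:

```shell
#!/bin/bash
# Print the state transitions of one application from ResourceManager
# log text read on stdin. $1 is the application ID.
state_changes() {
  grep -o "$1 State change from [A-Z_]* to [A-Z_]*"
}

# Hypothetical usage with the log file and application ID from this post:
#   state_changes application_1400563733088_0012 \
#     < hadoop-yarn/hadoop-cmf-yarn-RESOURCEMANAGER-vm-cluster-node2.log.out
```

An application that is never scheduled would be expected to show no transition after the one to ACCEPTED.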


I’m running the example in a local test environment with three virtual nodes on Cloudera (CDH5).

Below is run_pi.sh:

#!/bin/bash

export SPARK_HOME=/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark
export STANDALONE_SPARK_MASTER_HOST=vm-cluster-node2
export SPARK_MASTER_PORT=7077
export DEFAULT_HADOOP_HOME=/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hadoop

export SPARK_JAR_HDFS_PATH=/user/spark/share/lib/spark-assembly.jar

export SPARK_LAUNCH_WITH_SCALA=0
export SPARK_LIBRARY_PATH=${SPARK_HOME}/lib
export SCALA_LIBRARY_PATH=${SPARK_HOME}/lib
export SPARK_MASTER_IP=$STANDALONE_SPARK_MASTER_HOST

export HADOOP_HOME=${HADOOP_HOME:-$DEFAULT_HADOOP_HOME}

if [ -n "$HADOOP_HOME" ]; then
  export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:${HADOOP_HOME}/lib/native
fi
export SPARK_JAR=hdfs://vm-cluster-node2:8020/user/spark/share/lib/spark-assembly.jar

APP_JAR=/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark/assembly/lib/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar

$SPARK_HOME/bin/spark-class org.apache.spark.deploy.yarn.Client \
--jar $APP_JAR \
--class org.apache.spark.examples.SparkPi \
--args yarn-standalone \
--args 10 \
--num-workers 1 \
--master-memory 1g \
--worker-memory 500m \
--worker-cores 1
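
One quick sanity check on these numbers is whether each requested container fits under the 2048 MB per-container maximum reported in the client log. The sketch below makes that comparison. The 384 MB overhead added on top of each heap size is an assumption about this Spark version, not something shown in the log, and the function name `fits_in_container` is hypothetical:

```shell
#!/bin/bash
# Succeed if heap + overhead (both in MB) fits under the per-container maximum.
fits_in_container() {
  local heap_mb=$1 overhead_mb=$2 max_mb=$3
  [ $((heap_mb + overhead_mb)) -le "$max_mb" ]
}

OVERHEAD_MB=384   # assumed per-container overhead; adjust for your Spark version
MAX_MB=2048       # "Max mem capabililty of a single resource" from the client log

# 1024 = --master-memory 1g, 500 = --worker-memory 500m
for heap in 1024 500; do
  if fits_in_container "$heap" "$OVERHEAD_MB" "$MAX_MB"; then
    echo "$heap MB + $OVERHEAD_MB MB overhead fits under $MAX_MB MB"
  else
    echo "$heap MB + $OVERHEAD_MB MB overhead exceeds $MAX_MB MB"
  fi
done
```

Under that assumption both requests fit, so the per-container cap alone would not explain the hang. The check says nothing about how much memory the NodeManagers actually have free.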





