spark-user mailing list archives

From "Liu, Raymond" <raymond....@intel.com>
Subject RE: Anyone know how to submit spark job to yarn in java code?
Date Thu, 16 Jan 2014 00:52:12 GMT
Hi

Regarding your question

1) When I run the above script, which jar is being submitted to the YARN server?

Both the jar that the SPARK_JAR env variable points to and the jar that --jar points to are submitted to the YARN server.

2) It looks like the spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar plays the role of the client
side, and spark-examples-assembly-0.8.1-incubating.jar goes with the Spark runtime and examples
which will be running in YARN. Am I right?

The spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar will also go to the YARN cluster, serving as
the runtime for the app jar (spark-examples-assembly-0.8.1-incubating.jar).

3) Does anyone have any similar experience? I did lots of Hadoop MR work and want to follow
the same logic to submit a Spark job. For now I can only find the command-line way to submit
a Spark job to YARN. I believe there is an easier way to integrate Spark into a web application.

You can use yarn-client mode. You might want to take a look at docs/running-on-yarn.md,
and probably try the master branch to check our latest updates to that part of the docs.
In yarn-client mode, the SparkContext itself does much the same thing as the command
line is doing to submit a YARN job.

Then, to use it with Java, you might want to try JavaSparkContext instead of SparkContext.
I don't personally run it with complicated applications, but a small example app did work.
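For what it's worth, here is a rough sketch of what that could look like from Java. It is only an illustration, not something I have run as-is: the app name and jar path are placeholders, and it assumes you are on a build with yarn-client support and that SPARK_JAR, SPARK_YARN_APP_JAR and YARN_CONF_DIR are exported in the environment of the JVM, as described in docs/running-on-yarn.md.

import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class YarnClientExample {
  public static void main(String[] args) {
    // Assumes SPARK_JAR, SPARK_YARN_APP_JAR and YARN_CONF_DIR are already set
    // in this process's environment, per docs/running-on-yarn.md.
    JavaSparkContext sc = new JavaSparkContext(
        "yarn-client",                              // master: submit through YARN in client mode
        "WebAppSparkJob",                           // app name (placeholder)
        System.getenv("SPARK_HOME"),                // Spark home directory
        new String[] { "/path/to/your-app.jar" });  // jar shipped to the cluster (placeholder)

    // Trivial job just to check that the context works end to end.
    JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5));
    System.out.println("count = " + rdd.count());

    sc.stop();
  }
}

From a web application you would construct the context the same way inside your service code rather than in main().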
	

Best Regards,
Raymond Liu

-----Original Message-----
From: John Zhao [mailto:jzhao@alpinenow.com] 
Sent: Thursday, January 16, 2014 2:25 AM
To: user@spark.incubator.apache.org
Subject: Anyone know how to submit spark job to yarn in java code?

Now I am working on a web application and I want to submit a Spark job to Hadoop YARN.
I have already built my own assembly and can run it from the command line with the following script:

export YARN_CONF_DIR=/home/gpadmin/clusterConfDir/yarn
export SPARK_JAR=./assembly/target/scala-2.9.3/spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar
./spark-class org.apache.spark.deploy.yarn.Client \
  --jar ./examples/target/scala-2.9.3/spark-examples-assembly-0.8.1-incubating.jar \
  --class org.apache.spark.examples.SparkPi --args yarn-standalone \
  --num-workers 3 --master-memory 1g --worker-memory 512m --worker-cores 1

It works fine.
Then I realized that it is hard to submit the job from a web application. It looks like the
spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar or spark-examples-assembly-0.8.1-incubating.jar
is a really big jar; I believe it contains everything.
So my questions are:
1) When I run the above script, which jar is being submitted to the YARN server?
2) It looks like the spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar plays the role
of the client side, and spark-examples-assembly-0.8.1-incubating.jar goes with the Spark runtime and
examples which will be running in YARN. Am I right?
3) Does anyone have any similar experience? I did lots of Hadoop MR work and want to follow
the same logic to submit a Spark job. For now I can only find the command-line way to submit
a Spark job to YARN. I believe there is an easier way to integrate Spark into a web application.
 


Thanks.
John.
