spark-user mailing list archives

From Masood Krohy <>
Subject Re: Cluster deploy mode driver location
Date Tue, 22 Nov 2016 16:47:07 GMT
You may also try distributing your JARs along with your Spark app; see the 
options below. Put whatever is necessary on the client node and submit it 
all with each run. There is also a --files option, which you can remove 
below, though it can be helpful for shipping some config files.

You do not need to specify all the arguments; the default values are 
picked up when not explicitly given.

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 2 \
  --driver-memory 4g \
  --executor-memory 8g \
  --files /usr/hdp/current/spark-client/conf/hive-site.xml \
  --class "SparkApp" \
  /pathToAppOnTheClientNode/SparkApp.jar [arguments to the Spark app, if any]
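
If the app depends on extra JARs sitting on the client node, they can be 
shipped with each submission via the --jars option (a sketch; the dep1.jar 
and dep2.jar paths are illustrative, not from the original post):

```shell
# Dependency JARs on the client node are uploaded to the cluster at
# submit time and placed on both driver and executor classpaths.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 2 \
  --driver-memory 4g \
  --executor-memory 8g \
  --jars /pathOnTheClientNode/dep1.jar,/pathOnTheClientNode/dep2.jar \
  --files /usr/hdp/current/spark-client/conf/hive-site.xml \
  --class "SparkApp" \
  /pathToAppOnTheClientNode/SparkApp.jar
```

Note that --jars takes a comma-separated list, and the listed JARs do not 
need to pre-exist on the node YARN picks as the driver.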


Masood Krohy, Ph.D. 
Data Scientist, Intact Lab-R&D 
Intact Financial Corporation 

From:    Silvio Fiorito <>
To:      "" <>, 
"" <>
Date:    2016-11-22 08:02
Subject: Re: Cluster deploy mode driver location

Hi Saif!

Unfortunately, I don't think this is possible in YARN cluster mode. 
Regarding the JARs you're referring to: can you place them on HDFS, so 
they live in one central location and any node can refer to them for 
dependencies?
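
That suggestion might look like the following (a sketch; the 
/apps/sparkapp paths and JAR names are assumptions for illustration):

```shell
# Upload the dependency JARs to a shared location on HDFS once.
hdfs dfs -mkdir -p /apps/sparkapp/libs
hdfs dfs -put ./libs/dep1.jar ./libs/dep2.jar /apps/sparkapp/libs/

# Then any client node can submit without holding the JARs locally,
# referencing them by hdfs:// URI in --jars.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --jars hdfs:///apps/sparkapp/libs/dep1.jar,hdfs:///apps/sparkapp/libs/dep2.jar \
  --class "SparkApp" \
  /pathToAppOnTheClientNode/SparkApp.jar
```

In yarn-cluster mode the application JAR itself can also be given as an 
hdfs:// URI, which removes the client-node dependency entirely.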


From: <>
Sent: Monday, November 21, 2016 2:04:06 PM
Subject: Cluster deploy mode driver location 
Hello there,
I have a Spark program on 1.6.1; however, when I submit it to the cluster, 
the driver host is picked at random.
I know there is an option to specify the driver, but using it seems to 
require defining many other options I am not familiar with. The trouble 
is that the .jars I am launching need to be available on the driver host, 
and I would like to keep these jars on just one specific host, which I 
would like to be the driver.
Any help?
