spark-user mailing list archives

From Masood Krohy <masood.kr...@intact.net>
Subject Re: Cluster deploy mode driver location
Date Tue, 22 Nov 2016 16:47:07 GMT
You may also try distributing your JARs along with your Spark app; see the 
options below. You put whatever is necessary on the client node and submit 
it all with each run. There is also a --files option in the example below, 
which you can remove if not needed, but it may be helpful for some configs.

You do not need to specify all the arguments; default values are picked up 
for any that are not explicitly given.

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 2 \
  --driver-memory 4g \
  --executor-memory 8g \
  --files /usr/hdp/current/spark-client/conf/hive-site.xml \
  --jars /usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar \
  --class "SparkApp" \
  /pathToAppOnTheClientNode/SparkApp.jar [arguments passed to the Spark app, if any]
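For instance, a concrete run might look like the following. The two trailing values are hypothetical application arguments (an input path and a date, made up for illustration); everything after the application JAR is passed straight through to the app's main method.

```shell
# Everything after the application JAR is handed to SparkApp's main() as args.
# "/data/input" and "2016-11-22" are placeholder arguments, not from the thread.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class "SparkApp" \
  /pathToAppOnTheClientNode/SparkApp.jar /data/input 2016-11-22
```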

Masood


------------------------------
Masood Krohy, Ph.D. 
Data Scientist, Intact Lab-R&D 
Intact Financial Corporation 
http://ca.linkedin.com/in/masoodkh 



From:    Silvio Fiorito <silvio.fiorito@granturing.com>
To:      "Saif.A.Ellafi@wellsfargo.com" <Saif.A.Ellafi@wellsfargo.com>, 
"user@spark.apache.org" <user@spark.apache.org>
Date:    2016-11-22 08:02
Subject: Re: Cluster deploy mode driver location



Hi Saif!

Unfortunately, I don't think this is possible for the driver in YARN 
cluster mode. Regarding the JARs you're referring to, can you place them on 
HDFS so they sit in a central location and can be referenced that way as 
dependencies?

http://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management
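As a sketch of that approach (the HDFS paths and JAR names below are illustrative placeholders, not from the original thread): upload the dependency JARs to HDFS once, then point --jars at the hdfs:// URIs, so the driver can fetch them no matter which host YARN places it on.

```shell
# One-time step: copy the dependency JARs to a shared HDFS location.
# /apps/sparkapp/libs is a hypothetical path chosen for this example.
hdfs dfs -mkdir -p /apps/sparkapp/libs
hdfs dfs -put datanucleus-*.jar /apps/sparkapp/libs/

# Reference the JARs by hdfs:// URI at submit time; the driver and executors
# pull them from HDFS, so no JAR needs to live on one specific host.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --jars hdfs:///apps/sparkapp/libs/datanucleus-api-jdo-3.2.6.jar,hdfs:///apps/sparkapp/libs/datanucleus-rdbms-3.2.9.jar \
  --class "SparkApp" \
  hdfs:///apps/sparkapp/SparkApp.jar
```

The application JAR itself can also be given as an hdfs:// URI, as shown, which removes the last dependency on any particular client host.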

Thanks,
Silvio

From: Saif.A.Ellafi@wellsfargo.com <Saif.A.Ellafi@wellsfargo.com>
Sent: Monday, November 21, 2016 2:04:06 PM
To: user@spark.apache.org
Subject: Cluster deploy mode driver location 
 
Hello there,
 
I have a Spark 1.6.1 program; when I submit it to the cluster, the driver 
host is picked at random.
 
I know there is an option to specify the driver, but using it seems to 
require defining many other options I am not familiar with. The trouble is 
that the JARs I am launching need to be available on the driver host, and I 
would like to keep these JARs on just one specific host, which I would like 
to be the driver.
 
Any help?
 
Thanks!
Saif
 

