spark-user mailing list archives

From Russell Spitzer <russell.spit...@gmail.com>
Subject Re: Path of jars added to a Spark Job - spark-submit // // Override jars in spark submit
Date Thu, 12 Nov 2020 16:34:36 GMT
--driver-class-path does not move jars, so its behavior depends on your Spark
resource manager (master). The value is interpreted literally: if your files
do not exist at the location you provide, relative to where the driver is
run, they will not be placed on the classpath.
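
As a rough sketch of what "interpreted literally" means in practice, a
pre-flight check like the one below can be run from the directory where the
driver will start. check_classpath is a hypothetical helper, not part of
Spark, and it assumes plain paths (no Java classpath wildcards such as dir/*).

```shell
# Hypothetical helper: succeed only if every entry of a colon-separated
# classpath exists, resolved against the current directory.
check_classpath() {
  old_ifs=$IFS
  IFS=:
  missing=0
  # Unquoted on purpose: with IFS=":" the value is split on colons.
  for entry in $1; do
    if [ ! -e "$entry" ]; then
      echo "missing classpath entry: $entry" >&2
      missing=1
    fi
  done
  IFS=$old_ifs
  return "$missing"
}

# Example use before submitting:
#   check_classpath "some1.jar:some2.jar" || echo "fix --driver-class-path first"
```

If this fails for the value you pass to --driver-class-path, the driver would
not find those jars either.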

Since the driver is responsible for moving jars specified in --jars, you
cannot use a jar specified by --jars in --driver-class-path: the driver has
already started, and its classpath is already set, before any jars are
moved.
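
One possible workaround, assuming client deploy mode (where the driver starts
on the machine that runs spark-submit, so the local paths exist there), is to
give --driver-class-path the same absolute paths you pass to --jars. The
sketch below derives one list from the other and only prints the resulting
command; the paths and class name are the ones from your example, and
driver_cp is just an illustrative variable name.

```shell
# Comma-separated list, as expected by --jars (paths taken from the question).
jars="/a/b/some1.jar,/a/b/c/some2.jar"

# Same paths, colon-separated, as expected by --driver-class-path.
driver_cp=$(printf '%s' "$jars" | tr ',' ':')

# Print the command instead of running it, so it can be reviewed first.
echo spark-submit --deploy-mode client \
  --jars "$jars" \
  --driver-class-path "$driver_cp" \
  --class MyClass main-application.jar
```

In cluster mode the driver runs on a cluster node, so these paths would only
help if the jars already exist at the same location there.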

Some distributions may change this behavior, but this is the gist of it.

On Thu, Nov 12, 2020 at 10:02 AM Dominique De Vito <ddv36a78@gmail.com>
wrote:

> Hi,
>
> I am using Spark 2.1 (BTW) on YARN.
>
> I am trying to upload JARs to a YARN cluster and use them to replace
> on-site (already in place) JARs.
>
> I am trying to do so through spark-submit.
>
> One helpful answer
> https://stackoverflow.com/questions/37132559/add-jars-to-a-spark-job-spark-submit/37348234
> is the following one:
>
> spark-submit --jars additional1.jar,additional2.jar \
>   --driver-class-path additional1.jar:additional2.jar \
>   --conf spark.executor.extraClassPath=additional1.jar:additional2.jar \
>   --class MyClass main-application.jar
>
> So, I understand the following:
>
>    - "--jars" is for uploading jars to each node
>    - "--driver-class-path" is for using the uploaded jars for the driver.
>    - "--conf spark.executor.extraClassPath" is for using the uploaded jars
>    for executors.
>
> While I control the file paths for "--jars" within a spark-submit command,
> what will be the file paths of the uploaded JARs to be used in
> "--driver-class-path", for example?
>
> The doc says: "*JARs and files are copied to the working directory for
> each SparkContext on the executor nodes*"
>
> Fine, but for the following command, what should I put instead of XXX and
> YYY ?
>
> spark-submit --jars /a/b/some1.jar,/a/b/c/some2.jar \
>   --driver-class-path XXX:YYY \
>   --conf spark.executor.extraClassPath=XXX:YYY \
>   --class MyClass main-application.jar
>
> When using spark-submit, how can I reference the "*working directory for
> the SparkContext*" to form the XXX and YYY file paths?
>
> Thanks.
>
> Dominique
>
> PS: I have tried
>
> spark-submit --jars /a/b/some1.jar,/a/b/c/some2.jar \
>   --driver-class-path some1.jar:some2.jar \
>   --conf spark.executor.extraClassPath=some1.jar:some2.jar  \
>   --class MyClass main-application.jar
>
> No success (if I made no mistake)
>
> And I have tried also:
>
> spark-submit --jars /a/b/some1.jar,/a/b/c/some2.jar \
>    --driver-class-path ./some1.jar:./some2.jar \
>    --conf spark.executor.extraClassPath=./some1.jar:./some2.jar \
>    --class MyClass main-application.jar
>
> No success either.
>
