spark-user mailing list archives

From James Yu <ja...@ispot.tv>
Subject Re: Where do the executors get my app jar from?
Date Fri, 14 Aug 2020 15:08:41 GMT
Henoc,

OK, that covers YARN with HDFS. What happens when Kubernetes is the resource manager and
there is no HDFS?

James

________________________________
From: Henoc <mukadi.kalombo@gmail.com>
Sent: Thursday, August 13, 2020 10:45 PM
To: James Yu <james@ispot.tv>
Cc: user <user@spark.apache.org>; russell.spitzer@gmail.com <russell.spitzer@gmail.com>
Subject: Re: Where do the executors get my app jar from?

If you are running Spark on YARN, the spark-submit utility downloads the jar from S3 and
copies it to HDFS for the distributed cache. The driver shares this location with the YARN
NodeManagers via the container LaunchContext. The NodeManagers localize the jar and place it
on the container classpath before they launch the executor container.

Henoc

On Fri, Aug 14, 2020, 6:19 AM Russell Spitzer <russell.spitzer@gmail.com> wrote:
Looking back at the code

All --jars arguments and the like run through

https://github.com/apache/spark/blob/7f275ee5978e00ac514e25f5ef1d4e3331f8031b/core/src/main/scala/org/apache/spark/SparkContext.scala#L493-L500

Which calls

https://github.com/apache/spark/blob/7f275ee5978e00ac514e25f5ef1d4e3331f8031b/core/src/main/scala/org/apache/spark/SparkContext.scala#L1842

Which places local jars on the driver-hosted file server and leaves remote jars as is,
keeping the original path so that executors can access them directly
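
To make the local-versus-remote distinction concrete, here is a minimal Python sketch of the decision described above. It is a simplified model, not Spark's actual code: the function name `resolve_jar_for_executors`, the file server address `spark://driver-host:7078`, the `/jars/` URL layout, and the bucket name are all assumptions made for illustration.

```python
from urllib.parse import urlparse
import posixpath


def resolve_jar_for_executors(path, driver_file_server="spark://driver-host:7078"):
    """Return the location an executor would fetch the jar from.

    Simplified model of the behaviour described in this thread; the
    function name and the default server address are hypothetical.
    """
    parsed = urlparse(path)
    if parsed.scheme in ("", "file"):
        # Jar is local to the driver: it is served from the
        # driver-hosted file server under the jar's file name.
        name = posixpath.basename(parsed.path)
        return f"{driver_file_server}/jars/{name}"
    # Remote jar (s3a, hdfs, http, ...): the path is left as is,
    # and executors fetch it from that location themselves.
    return path


if __name__ == "__main__":
    for jar in ("/opt/app/my-app.jar", "s3a://my-bucket/my-app.jar"):
        print(jar, "->", resolve_jar_for_executors(jar))
```

Under this model, a jar given as a plain or `file:` path reaches executors through the driver, while a jar given as an S3 path would be read by each executor from S3.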

On Thu, Aug 13, 2020 at 11:01 PM Russell Spitzer <russell.spitzer@gmail.com> wrote:
The driver hosts a file server which the executors download the jar from.

On Thu, Aug 13, 2020, 5:33 PM James Yu <james@ispot.tv> wrote:
Hi,

When I spark-submit a Spark app with my app jar located in S3, the driver will obviously
download the jar from the S3 location. What is not clear to me is where the executors get the
jar from: from the same S3 location, somehow from the driver, or do they not need the jar at all?

Thanks in advance for any explanation.

James
