spark-user mailing list archives

From Felix Cheung <felixcheun...@hotmail.com>
Subject Re: spark.submit.deployMode: cluster
Date Thu, 28 Mar 2019 16:42:29 GMT
If anyone wants to improve docs, please create a PR.

lol


But seriously, you might want to explore other projects that manage job submission on top of
Spark instead of rolling your own with spark-submit.
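
If you do stay programmatic, note that Spark ships a launcher API that wraps
spark-submit, so cluster deploy mode is honored. A minimal sketch, with
placeholder jar path, class name, and master URL:

    import org.apache.spark.launcher.SparkLauncher

    object Launch {
      def main(args: Array[String]): Unit = {
        // SparkLauncher shells out to spark-submit, so deploy mode applies.
        val handle = new SparkLauncher()
          .setAppResource("/path/to/app.jar")   // placeholder jar
          .setMainClass("com.example.MyJob")    // placeholder main class
          .setMaster("spark://master:7077")     // placeholder master URL
          .setDeployMode("cluster")
          .startApplication()
        // handle.getState() can be polled to track the app's lifecycle.
      }
    }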


________________________________
From: Pat Ferrel <pat@occamsmachete.com>
Sent: Tuesday, March 26, 2019 2:38 PM
To: Marcelo Vanzin
Cc: user
Subject: Re: spark.submit.deployMode: cluster

Ahh, thank you indeed!

It would have saved us a lot of time if this had been documented. I know, OSS, so contributions
are welcome… I can also imagine your next comment: “If anyone wants to improve docs, see
the Apache contribution rules and create a PR,” or something like that.

BTW, the code where the context is known and can be used is what I’d call a Driver, and since
all code is copied to the nodes and is known in the jars, it was not obvious to us that this rule
existed. But it does make sense.

It appears we will need to refactor our code to use spark-submit.
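
Presumably the submission will end up looking something like this (the master URL,
class, and jar path are placeholders for our setup):

    spark-submit \
      --master spark://master:7077 \
      --deploy-mode cluster \
      --class com.example.MyJob \
      /path/to/app.jar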

Thanks again.


From: Marcelo Vanzin <vanzin@cloudera.com>
Reply: Marcelo Vanzin <vanzin@cloudera.com>
Date: March 26, 2019 at 1:59:36 PM
To: Pat Ferrel <pat@occamsmachete.com>
Cc: user <user@spark.apache.org>
Subject: Re: spark.submit.deployMode: cluster

If you're not using spark-submit, then that option does nothing.

If by "context creation API" you mean "new SparkContext()" or an
equivalent, then you're explicitly creating the driver inside your
application.
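
Concretely, a minimal sketch (the master URL is a placeholder):

    import org.apache.spark.{SparkConf, SparkContext}

    // Constructing the context in-process makes this JVM the driver
    // (client mode in effect), no matter what the conf below says.
    val conf = new SparkConf()
      .setAppName("in-process-driver")
      .setMaster("spark://master:7077")          // placeholder master URL
      .set("spark.submit.deployMode", "cluster") // ignored without spark-submit
    val sc = new SparkContext(conf)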

On Tue, Mar 26, 2019 at 1:56 PM Pat Ferrel <pat@occamsmachete.com> wrote:
>
> I have a server that starts a Spark job using the context creation API. It DOES NOT use
> spark-submit.
>
> I set spark.submit.deployMode = “cluster”
>
> In the GUI I see 2 workers with 2 executors. The link for the running application “name”
> goes back to my server, the machine that launched the job.
>
> This is spark.submit.deployMode = “client” according to the docs. I set the Driver
> to run on the cluster, but it runs on the client, ignoring spark.submit.deployMode.
>
> Is this expected? It is documented nowhere that I can find.
>


--
Marcelo
