spark-user mailing list archives

From Tobias Pfeiffer <...@preferred.jp>
Subject Re: Multi-tenancy for Spark (Streaming) Applications
Date Thu, 11 Sep 2014 08:12:13 GMT
Hi,

By now I think I understand a bit better how spark-submit and YARN work
together, and how the Spark driver and executors interact on YARN.

Now for my use case, as described on <
https://spark.apache.org/docs/latest/submitting-applications.html>, I would
probably have an end-user-facing gateway that submits my Spark (Streaming)
application to the YARN cluster in yarn-cluster mode.
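
To make the setup concrete, here is a minimal sketch of such a gateway in Python. It only assembles and launches a spark-submit command line. The jar path, main class, and the helper names build_submit_command and submit are hypothetical; "--master yarn-cluster" is the form used by Spark 1.x, while later releases express the same thing as "--master yarn --deploy-mode cluster".

```python
import subprocess


def build_submit_command(app_jar, main_class, app_args=(),
                         spark_submit="./spark-submit"):
    """Build the argv list for a yarn-cluster submission.

    All arguments are placeholders supplied by the gateway; nothing here
    imports or links against the Spark libraries.
    """
    return [
        spark_submit,
        "--master", "yarn-cluster",  # Spark 1.x syntax (assumption, see lead-in)
        "--class", main_class,
        app_jar,
        *app_args,
    ]


def submit(app_jar, main_class, app_args=()):
    """Start spark-submit as a child process and return the Popen handle.

    stderr is merged into stdout so the gateway can read all client
    output from a single stream.
    """
    return subprocess.Popen(
        build_submit_command(app_jar, main_class, app_args),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
```

For a streaming job, the returned process handle keeps running for as long as the client monitors the application, so the gateway would typically read its output in a background thread rather than wait for it to exit.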

I have a couple of questions regarding that setup:
* That gateway does not need to be written in Scala or Java; it actually
has no contact with the Spark libraries and just executes a program on
the command line ("./spark-submit ..."), right?
* Since my application is a streaming application, it won't finish on
its own. What is the best way to terminate the application on the cluster
from my gateway program? Can I just send SIGTERM to the spark-submit
program? Is that recommended?
* I guess there are many possibilities to achieve that, but what is a good
way to send commands/instructions to the running Spark application? If I
want to push some commands from the gateway to the Spark driver, I guess I
need to get its IP address - how? If I want the Spark driver to pull its
instructions, what is a good way to do so? Any suggestions?
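
One possible approach to the termination question, sketched under assumptions: in yarn-cluster mode the driver runs inside the cluster, so stopping the local spark-submit process may not stop the application itself. The gateway could instead look for the YARN application ID in the client's output and pass it to "yarn application -kill". The sketch assumes the output contains an ID of the form application_<clusterTimestamp>_<sequence>; the helper names and the sample log line are hypothetical.

```python
import re
import subprocess

# YARN application IDs look like application_<clusterTimestamp>_<sequence>.
APP_ID_PATTERN = re.compile(r"application_\d+_\d+")


def find_application_id(log_text):
    """Return the first YARN application ID found in log_text, or None."""
    match = APP_ID_PATTERN.search(log_text)
    return match.group(0) if match else None


def build_kill_command(application_id, yarn_bin="yarn"):
    """Build the argv list for killing an application through the YARN CLI."""
    return [yarn_bin, "application", "-kill", application_id]


def kill_application(application_id):
    """Run the kill command and return the CLI's exit code."""
    return subprocess.call(build_kill_command(application_id))
```

Note that killing the application this way is abrupt: it does not give the streaming job a chance to finish the batch it is processing, so a cooperative shutdown signal handled inside the application may still be preferable.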

Thanks,
Tobias
