spark-user mailing list archives

From Silvio Fiorito <>
Subject RE: Several applications share the same Spark executors (or their cache)
Date Thu, 08 Jan 2015 12:47:55 GMT
Rather than running duplicate Spark apps, with the web app holding a direct reference to the SparkContext, why not use a queue or message bus to submit your requests? That way you're not wasting resources caching the same data twice in Spark, and you can scale your web tier independently of the Spark cluster.
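The pattern suggested above can be sketched with a plain in-process queue standing in for the message bus. This is only an illustration of the architecture, not Spark code: a single long-running driver owns the cached dataset and serves requests from every web-tier instance, so the cache is built once. The names (`submit`, `driver_loop`, `CACHE`) are hypothetical, and in a real deployment the queue would be an external broker (e.g. Kafka or RabbitMQ) and the work inside the loop a Spark job over a cached RDD.

```python
# Sketch: one driver process holds the (expensive) cache; N web-app
# instances submit work through a shared queue instead of each creating
# their own SparkContext and duplicating the executor cache.
import queue
import threading

requests = queue.Queue()  # stands in for a message bus (Kafka, RabbitMQ, ...)

# The single driver loads the big dataset exactly once
# (imagine rdd.cache() here instead of a Python list).
CACHE = {"dataset": list(range(1_000_000))}

def driver_loop():
    """Single consumer: owns the cache, serves all web-app instances."""
    while True:
        req = requests.get()
        if req is None:           # poison pill shuts the loop down
            break
        query, reply = req
        # In a real app this would be a Spark action over the cached data.
        reply.put(sum(CACHE["dataset"][:query]))

driver = threading.Thread(target=driver_loop)
driver.start()

def submit(n):
    """What each web-app instance calls: enqueue a request, wait for the reply."""
    reply = queue.Queue()
    requests.put((n, reply))
    return reply.get()

# Two "web application instances" share the same driver and its cache.
r1 = submit(4)   # instance 1: sum of 0..3
r2 = submit(3)   # instance 2: sum of 0..2
print(r1, r2)

requests.put(None)   # stop the driver loop
driver.join()
```

The key property is that high availability moves to the web tier and the broker; the Spark driver remains a single stateful consumer, so its executor cache is shared by construction.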
From: preeze<>
Sent: 1/8/2015 5:59 AM
Subject: Several applications share the same Spark executors (or their cache)

Hi all,

We have a web application that connects to a Spark cluster to trigger some
calculations there. It also caches a big amount of data in the Spark executors'
memory.

To meet high-availability requirements we need to run 2 instances of our web
application on different hosts. Doing this in the straightforward way means that
the second application fires up another set of executors, which initialize
their own huge cache, totally identical to the first application's.

Ideally we would like to reuse the cache in Spark across all instances of our
application.

I am aware of the possibility of using Tachyon to externalize the executors'
cache, and am currently exploring other options.

Is there any way to allow several instances of the same application to
connect to the same set of Spark executors?

