spark-user mailing list archives

From Tien Dat
Subject [SPARK on MESOS] Avoid re-fetching Spark binary
Date Fri, 06 Jul 2018 08:00:17 GMT
Dear all,

We are running Spark with Mesos as the master for resource management.
Some jobs in our cluster require a very short response time (near
real-time applications), usually around 3-5 seconds.

For Spark to run on Mesos, one has to specify the
SPARK_EXECUTOR_URI configuration, which defines the location from which Mesos
fetches the Spark binary every time it launches a new job.
We noticed that the fetching and extraction of the Spark binary is repeated
on every run, even though the binary is essentially the same. More
importantly, fetching and extracting this file can add 2-3 seconds of
latency, which is fatal for our near real-time application. In addition, after
many Spark jobs have run, copies of the Spark binary tarball accumulate and
occupy a large amount of disk space.
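
To make the setup concrete, here is a minimal sketch of the kind of submission
described above. It assumes the spark.executor.uri property, which as far as I
understand is the property form of the SPARK_EXECUTOR_URI environment
variable. The helper name build_spark_submit, the master URL, the tarball
location, and the application path are all hypothetical placeholders, not our
actual values.

```python
import shlex


def build_spark_submit(master_url, executor_uri, app_path):
    """Build a spark-submit argument list for a job running on Mesos.

    Hypothetical helper for illustration only; it just assembles arguments
    and does not launch anything.
    """
    return [
        "spark-submit",
        "--master", master_url,
        # Mesos fetches and extracts this archive when it launches executors.
        "--conf", f"spark.executor.uri={executor_uri}",
        app_path,
    ]


cmd = build_spark_submit(
    master_url="mesos://mesos-master.example:5050",          # placeholder
    executor_uri="hdfs://namenode.example/dist/spark.tgz",   # placeholder
    app_path="job.py",                                       # placeholder
)

# Print the command line as it would be typed in a shell.
print(shlex.join(cmd))
```

The fetch and extraction of the archive named by spark.executor.uri is the
step that costs us the 2-3 seconds on each run.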

We therefore wonder whether there is a workaround to avoid this fetching and
extraction step, given that the Spark binary is available locally on each
Mesos agent.

Please don't hesitate to ask if you need any further information.
Thank you in advance.

Best regards
