spark-user mailing list archives

From Matthias Niehoff <matthias.nieh...@codecentric.de>
Subject Dynamic Resource Allocation with Spark Streaming (Standalone Cluster, Spark 1.5.1)
Date Mon, 26 Oct 2015 20:00:33 GMT
Hello everybody,

I have a few (~15) Spark Streaming jobs which have load peaks as well as
long periods of low load. So I thought the new Dynamic Resource
Allocation for Standalone Clusters (SPARK-4751) might be helpful.

I have a test "cluster" with 1 worker consisting of 4 executors with 2
cores each, so 8 cores in total.

I started a simple streaming application without limiting the max cores for
this app. As expected the app occupied every core of the cluster. Then I
started a second app, also without limiting the maximum cores. As the first
app did not get any input through the stream, my naive expectation was that
the second app would get at least 2 cores (1 receiver, 1 processing), but
that's not what happened. The cores are still assigned to the first app.
When I look at the application UI of the first app every executor is still
running. That explains why no executor is used for the second app.
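
A minimal sketch of the settings such a setup presumably relies on, written
as a small Python helper that renders spark-submit flags. The property names
are taken from the Spark configuration documentation; the helper names
(dynamic_allocation_conf, to_submit_args) and the chosen values are
assumptions for illustration, not the exact configuration used in this test.

```python
# Hypothetical helpers: build Spark properties for dynamic allocation and
# render them as spark-submit "--conf key=value" arguments.

def dynamic_allocation_conf(idle_timeout_s=60, min_executors=0, max_cores=None):
    """Return a dict of Spark properties (values are illustrative)."""
    conf = {
        # Turns dynamic allocation on for the application.
        "spark.dynamicAllocation.enabled": "true",
        # The external shuffle service is required for dynamic allocation.
        "spark.shuffle.service.enabled": "true",
        # An executor idle for this long becomes a candidate for removal.
        "spark.dynamicAllocation.executorIdleTimeout": f"{idle_timeout_s}s",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
    }
    # Leaving spark.cores.max unset mirrors "without limiting the max cores".
    if max_cores is not None:
        conf["spark.cores.max"] = str(max_cores)
    return conf


def to_submit_args(conf):
    """Flatten the properties into a list of spark-submit arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(dynamic_allocation_conf())))
```

Passing max_cores=4 would additionally emit spark.cores.max=4, which caps
the app at half of the 8 cores in the test cluster.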

I end up with two questions:
- When does an executor become idle in a Spark Streaming application (so
that it could be reassigned to another app)?
- Is there another way to cope with uncertain load when using Spark
Streaming applications? I already combined multiple jobs into one Spark
application using different threads, but this approach has reached its
limit for me, because the Spark applications get too big to manage.

Thank you!
