spark-user mailing list archives

From Karthik Palaniappan <>
Subject [Streaming][Structured Streaming] Understanding dynamic allocation in streaming jobs
Date Tue, 22 Aug 2017 19:24:48 GMT
I'm trying to understand dynamic allocation in Spark Streaming and Structured Streaming. It
seems that if you set spark.dynamicAllocation.enabled=true, both frameworks use Core's dynamic
allocation algorithm: request executors when the task backlog reaches a certain size, and remove
executors when they have been idle for a certain period of time.
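For context, the Core settings I'm referring to are roughly the following (a sketch in spark-defaults.conf form; the values are illustrative, not recommendations):

```properties
# Core dynamic allocation knobs
spark.dynamicAllocation.enabled=true
# External shuffle service so shuffle files survive executor removal
spark.shuffle.service.enabled=true
# Request more executors once tasks have been backlogged this long
spark.dynamicAllocation.schedulerBacklogTimeout=1s
# Remove an executor after it has been idle this long
spark.dynamicAllocation.executorIdleTimeout=60s
# Bounds on the executor count
spark.dynamicAllocation.minExecutors=0
spark.dynamicAllocation.maxExecutors=10
```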

However, as this Cloudera post points out, that algorithm doesn't really make sense for streaming.
When writing a toy streaming job, I did run into the issue where executors are never removed. Cloudera's
suggestion of simply turning off dynamic allocation seems unreasonable -- Spark applications should
grow and shrink to match their workload.

I see that Spark Streaming has its own (undocumented) configuration for dynamic allocation. Is that
actually a supported feature, or was it just an experiment? I had trouble getting it to work, but I'll
follow up on that in a separate thread.
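For reference, the streaming-specific keys I was experimenting with look roughly like this (names taken from the ExecutorAllocationManager in the spark-streaming module; since this is undocumented, treat the keys and semantics as assumptions from reading the source):

```properties
# Undocumented Spark Streaming dynamic allocation (experimental)
spark.streaming.dynamicAllocation.enabled=true
# How often to evaluate a scaling decision
spark.streaming.dynamicAllocation.scalingInterval=60s
# Scale up when (batch processing time / batch interval) exceeds this ratio
spark.streaming.dynamicAllocation.scalingUpRatio=0.9
# Scale down when that ratio falls below this
spark.streaming.dynamicAllocation.scalingDownRatio=0.3
# Bounds on the executor count
spark.streaming.dynamicAllocation.minExecutors=1
spark.streaming.dynamicAllocation.maxExecutors=10
```

If I'm reading the source correctly, this code path also requires Core dynamic allocation (spark.dynamicAllocation.enabled) to be off, which may be part of why I had trouble getting it to work.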

Also, does Structured Streaming have its own dynamic allocation algorithm?


Karthik Palaniappan
