spark-user mailing list archives

From Zhenyu Hu <>
Subject Spark DStream Dynamic Allocation
Date Thu, 12 Aug 2021 02:33:29 GMT
1. First of all, is dynamic scaling (dynamic allocation) for Spark
DStreams available now? It is not mentioned in the Spark documentation.
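For context on question 1: the streaming-specific variant is controlled by `spark.streaming.dynamicAllocation.*` properties (implemented in `org.apache.spark.streaming.scheduler.ExecutorAllocationManager`), which indeed do not appear on the official configuration page. A sketch of a `spark-defaults.conf` fragment; the defaults shown are taken from the Spark source and may differ between versions:

```properties
# Enable the streaming-specific executor allocation manager
spark.streaming.dynamicAllocation.enabled            true
# Core (batch) dynamic allocation must be off for the streaming variant
spark.dynamicAllocation.enabled                      false
# How often the scaling decision runs (default 60s)
spark.streaming.dynamicAllocation.scalingInterval    60s
# Scale up when avg proc time / batch duration >= this (default 0.9)
spark.streaming.dynamicAllocation.scalingUpRatio     0.9
# Scale down (kill one non-receiver executor) when the ratio <= this (default 0.3)
spark.streaming.dynamicAllocation.scalingDownRatio   0.3
# Bounds on the executor count (example values)
spark.streaming.dynamicAllocation.minExecutors       1
spark.streaming.dynamicAllocation.maxExecutors       10
```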
2. Spark DStream dynamic scaling randomly kills a non-receiver executor
when the average processing delay divided by the batch interval falls
below 0.5. But this may cause the executor to lose cached or shuffle
data. How should that situation be handled?
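The decision described above can be sketched as follows. This is a minimal illustration of the ratio test, not Spark's actual code; the function name and the default thresholds here are assumptions for the sketch:

```python
def scaling_decision(avg_proc_time_ms: float,
                     batch_duration_ms: float,
                     scale_up_ratio: float = 0.9,
                     scale_down_ratio: float = 0.5) -> str:
    """Return the action the allocation manager would take for one interval.

    ratio = average batch processing time / batch duration. Above the
    upper threshold, another executor is requested; below the lower
    threshold, one randomly chosen non-receiver executor is killed.
    """
    ratio = avg_proc_time_ms / batch_duration_ms
    if ratio >= scale_up_ratio:
        return "request-executor"
    if ratio <= scale_down_ratio:
        # The killed executor's cached RDD blocks and shuffle files are
        # lost unless something external preserves them.
        return "kill-executor"
    return "no-op"
```

On the data-loss concern: shuffle files survive an executor kill when the external shuffle service is enabled (`spark.shuffle.service.enabled=true`), while lost cached blocks are recomputed from RDD lineage (or survive if a replicated storage level was used), so the usual mitigation is to run streaming dynamic allocation with the external shuffle service on.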
3. If a WindowedDStream exists, job batches are triggered according to
the slideDuration of the WindowedDStream, but DStream dynamic scaling is
still based on each job batch's processing delay divided by the
batchDuration. Is this reasonable? I think the ratio should instead be
calculated by dividing the job processing delay by the slideDuration.
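The concern in question 3 can be shown with hypothetical numbers: for a windowed stream, jobs fire only every slideDuration, so dividing the processing delay by the base batchDuration can overstate the load. All values below are made up for illustration:

```python
def load_ratio(proc_time_ms: float, interval_ms: float) -> float:
    """Processing delay as a fraction of the interval between jobs."""
    return proc_time_ms / interval_ms

batch_duration_ms = 2_000    # base StreamingContext batch interval
slide_duration_ms = 10_000   # WindowedDStream slide interval
proc_time_ms = 8_000         # time to process one windowed job

# Measured against batchDuration the stream looks overloaded (ratio 4.0),
# yet a new job only arrives every 10 s, so it is actually keeping up
# (ratio 0.8 against slideDuration).
ratio_vs_batch = load_ratio(proc_time_ms, batch_duration_ms)
ratio_vs_slide = load_ratio(proc_time_ms, slide_duration_ms)
```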
