    [ https://issues.apache.org/jira/browse/SPARK-15429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292908#comment-15292908 ]

Albert Cheng edited comment on SPARK-15429 at 5/20/16 8:02 AM:
---------------------------------------------------------------

I have an idea about this issue.
First, add a new parameter `concurrentJobs` to PIDRateEstimator.
Second, change `error = latestRate - processingRate` to `error = latestRate - processingRate * concurrentJobs.toDouble`, and change `historicalError = schedulingDelay.toDouble * processingRate / batchIntervalMillis` to `historicalError = schedulingDelay.toDouble * processingRate * concurrentJobs.toDouble / batchIntervalMillis`.
Is this correct? I would like to fix this.

> When `spark.streaming.concurrentJobs > 1`, PIDRateEstimator cannot estimate the receiving rate accurately.
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-15429
>                 URL: https://issues.apache.org/jira/browse/SPARK-15429
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.6.1
>            Reporter: Albert Cheng
>
> When `spark.streaming.concurrentJobs > 1`, PIDRateEstimator cannot estimate the receiving rate accurately.
> For example, suppose the batch duration is set to 10 seconds and each RDD in the DStream takes 20s to compute.
> Setting `spark.streaming.concurrentJobs=2` lets two batches run concurrently, but each RDD in the DStream still takes 20s to consume its data, so PIDRateEstimator sees only the per-job processing rate and estimates backpressure poorly.
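
To make the change proposed in the comment concrete, here is a minimal Scala sketch of the two modified terms. It is not the actual Spark source: the object `PidSketch`, the case class `Terms`, the helper `terms`, and the figure of 20000 elements per batch are hypothetical, and the way `processingRate` is derived from the element count and processing delay is an assumption. Only the `error` and `historicalError` formulas come from the comment.

```scala
// Hypothetical sketch of the proposal; names other than the two formulas are illustrative.
object PidSketch {
  final case class Terms(error: Double, historicalError: Double)

  def terms(
      latestRate: Double,        // rate currently allowed, in elements per second
      numElements: Long,         // elements in the completed batch
      processingDelayMs: Long,   // time one job spent processing the batch
      schedulingDelayMs: Long,   // time the batch waited before it started
      batchIntervalMillis: Long,
      concurrentJobs: Int): Terms = {
    // Assumption: rate achieved by a single job, in elements per second.
    val processingRate = numElements.toDouble / processingDelayMs * 1000
    // Proposed change: scale the per-job rate by the number of concurrent jobs.
    val error = latestRate - processingRate * concurrentJobs.toDouble
    val historicalError =
      schedulingDelayMs.toDouble * processingRate * concurrentJobs.toDouble / batchIntervalMillis
    Terms(error, historicalError)
  }

  def main(args: Array[String]): Unit = {
    // Numbers from the issue: 10s batches, 20s per RDD; 20000 elements per batch is made up.
    println(terms(2000.0, 20000L, 20000L, 0L, 10000L, concurrentJobs = 1)) // Terms(1000.0,0.0)
    println(terms(2000.0, 20000L, 20000L, 0L, 10000L, concurrentJobs = 2)) // Terms(0.0,0.0)
  }
}
```

With these numbers, the unscaled formula reports an error of 1000 elements per second even though two concurrent jobs keep up with the input. With the scaled formula the error drops to zero.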