spark-dev mailing list archives

From Akhil Das <>
Subject Re: Back-pressure for Spark Streaming
Date Fri, 08 May 2015 17:35:53 GMT
We had a similar issue while working on one of our use cases, where we were
processing at a moderate throughput (around 500 MB/s). When the processing
time exceeded the batch duration, it started to throw BlockNotFound
exceptions. I made a workaround for that issue, and it is explained over here

Basically, instead of generating blocks blindly, I made the receiver sleep
when the scheduling delay grows too large (when it exceeds 3 times the batch
duration). This prototype is working nicely and the speed is encouraging, as
it's processing at 500 MB/s without any failures so far.
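
To make the idea concrete, here is a minimal sketch of that throttling rule in plain Python. It is not the code from the workaround itself and does not use any Spark API: the function names, the 100 ms pause, and the callbacks (read_block, store_block, current_scheduling_delay_ms) are all hypothetical stand-ins. In a real job the scheduling delay would have to come from the streaming runtime, for example from batch statistics reported to a listener.

```python
import time


def throttle_pause_ms(scheduling_delay_ms, batch_duration_ms,
                      threshold_factor=3, pause_ms=100):
    """Return how long to pause (in ms) before generating the next block.

    Assumption: the receiver backs off only when the scheduling delay
    exceeds threshold_factor times the batch duration (3x in the email).
    The pause length is an arbitrary illustrative value.
    """
    if scheduling_delay_ms > threshold_factor * batch_duration_ms:
        return pause_ms
    return 0


def receive_loop(read_block, store_block, current_scheduling_delay_ms,
                 batch_duration_ms, max_blocks, sleep=time.sleep):
    """Store up to max_blocks blocks, sleeping while the job is falling behind.

    All callbacks are hypothetical: read_block pulls data from the source,
    store_block hands it to the engine, and current_scheduling_delay_ms
    returns the latest observed scheduling delay in milliseconds.
    """
    stored = 0
    while stored < max_blocks:
        pause = throttle_pause_ms(current_scheduling_delay_ms(),
                                  batch_duration_ms)
        if pause > 0:
            # Falling behind: wait instead of producing another block.
            sleep(pause / 1000.0)
            continue
        store_block(read_block())
        stored += 1
    return stored
```

With a 2 second batch duration, the loop above keeps storing blocks until the reported delay goes above 6 seconds, then pauses and re-checks the delay before producing anything else.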

Best Regards

On Fri, May 8, 2015 at 8:11 PM, François Garillot <> wrote:

> Hi guys,
> We[1] are doing a bit of work on Spark Streaming, to help it cope with
> situations where the throughput of data on an InputStream is (momentarily)
> liable to overwhelm the memory of the Receiver(s).
> The JIRA & design doc is here:
> We'd sure appreciate your comments!
> --
> François Garillot
> [1]: Typesafe & some helpful collaborators on benchmarking 'at scale'
