spark-user mailing list archives

From Jeff Nadler <jnad...@srcginc.com>
Subject Re: Streaming Backpressure with Multiple Streams
Date Sat, 10 Sep 2016 00:54:52 GMT
Yes I'll test that next.

On Sep 9, 2016 5:36 PM, "Cody Koeninger" <cody@koeninger.org> wrote:

> Does the same thing happen if you're only using the direct stream plus
> backpressure, not the receiver stream?
>
> On Sep 9, 2016 6:41 PM, "Jeff Nadler" <jnadler@srcginc.com> wrote:
>
>> Maybe this is a pretty esoteric implementation, but I'm seeing some bad
>> behavior with backpressure plus multiple Kafka streams / direct streams.
>>
>> Here's the scenario:
>> We have 1 Kafka topic using the reliable receiver (4 receivers, with the
>> results unioned). In the same app, we consume another Kafka topic using a
>> direct stream.
>>
>> This may seem strange, but it's necessary in my application to work
>> around another problem: the max rate is set globally in SparkConf. IMO it
>> would be more flexible if we could set the max rate for each stream
>> independently. Since the direct stream uses a different config parameter
>> for its max rate, we get the desired result.
>>
>> A bit hacky I know.
>>
>> Anyway, we recently turned on backpressure. It works as expected for
>> the receiver-based stream. For the direct stream, it starts out at the
>> max rate (as expected) on the first batch. Then it ratchets down the
>> consumption until it is eventually consuming 1 record / second / partition.
>>
>> This happens even though there's no scheduling delay, and the
>> receiver-based stream does not appear to be throttled.
>>
>> Anyone ever see anything like this?
>>
>> Thanks!
>>
>> Jeff Nadler
>> Aerohive Networks
>>
>>
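
For reference, the setup described in the quoted message relies on the receiver-based and direct Kafka streams reading their rate limits from different SparkConf keys. A minimal sketch of such a configuration, assuming Spark Streaming 1.5 or later (where backpressure is available); the app name and the numeric rates are placeholders, since the thread does not give the actual values:

```scala
import org.apache.spark.SparkConf

// Illustrative values only; the original message does not state the real rates.
val conf = new SparkConf()
  .setAppName("backpressure-two-streams") // hypothetical app name
  // Enables backpressure for all streams in the application.
  .set("spark.streaming.backpressure.enabled", "true")
  // Upper bound for receiver-based streams, in records per second per receiver.
  .set("spark.streaming.receiver.maxRate", "1000")
  // Upper bound for Kafka direct streams, in records per second per partition.
  .set("spark.streaming.kafka.maxRatePerPartition", "500")
```

Both limits are application-wide, so every receiver-based stream shares the first and every direct stream shares the second, which is why mixing the two stream types gives two independently tunable caps.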
