spark-user mailing list archives

From Tim Smith <secs...@gmail.com>
Subject Re: Low Level Kafka Consumer for Spark
Date Mon, 15 Sep 2014 17:03:50 GMT
Hi Dibyendu,

I am a little confused about the need for rate limiting input from
Kafka. If the stream coming in from Kafka has a higher messages/second
rate than what the Spark job can process, then it should simply build a
backlog in Spark, as long as the RDDs are persisted to disk using persist().
Right?
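
As a minimal sketch of what I mean (1.1-era receiver API; the broker,
group, and topic names are placeholders, and the maxRate value is just an
example), a disk-spillable storage level plus the receiver rate cap would
look roughly like this:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf()
  .setAppName("KafkaBacklogSketch")
  // Optionally cap each receiver at N records/sec instead of relying
  // on an unbounded backlog
  .set("spark.streaming.receiver.maxRate", "1000")

val ssc = new StreamingContext(conf, Seconds(10))

// Store received blocks with a disk-spillable storage level, so data
// that arrives faster than it is processed overflows to disk instead
// of being dropped from memory
val stream = KafkaUtils.createStream(
  ssc,
  "zkhost:2181",            // ZooKeeper quorum (placeholder)
  "example-group",          // consumer group (placeholder)
  Map("example-topic" -> 1),
  StorageLevel.MEMORY_AND_DISK_SER)

stream.count().print()
ssc.start()
ssc.awaitTermination()
```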

Thanks,

Tim


On Mon, Sep 15, 2014 at 4:33 AM, Dibyendu Bhattacharya
<dibyendu.bhattachary@gmail.com> wrote:
> Hi Alon,
>
> No, this does not guarantee that the same set of messages will come in the
> same RDD. This fix just replays the messages from the last processed offset,
> in the same order. Again, this is just an interim fix we needed to solve our
> use case. If you do not need this message-replay feature, just do not perform
> the ack (acknowledgement) call in the Driver code. Then the processed
> messages will not be written to ZK, and hence replay will not happen.
>
> Regards,
> Dibyendu
>
> On Mon, Sep 15, 2014 at 4:48 PM, Alon Pe'er <alon.p@supersonicads.com>
> wrote:
>>
>> Hi Dibyendu,
>>
>> Thanks for your great work!
>>
>> I'm new to Spark Streaming, so I just want to make sure I understand the
>> Driver failure issue correctly.
>>
>> In my use case, I want to make sure that messages coming in from Kafka are
>> always broken into the same set of RDDs, meaning that if a set of messages
>> is assigned to one RDD, and the Driver dies before this RDD is processed,
>> then once the Driver recovers, the same set of messages is assigned to a
>> single RDD, instead of the messages being arbitrarily repartitioned across
>> different RDDs.
>>
>> Does your Receiver guarantee this behavior, until the problem is fixed in
>> Spark 1.2?
>>
>> Regards,
>> Alon
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p14233.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>> For additional commands, e-mail: user-help@spark.apache.org
>>
>


