spark-user mailing list archives

From Dibyendu Bhattacharya <dibyendu.bhattach...@gmail.com>
Subject Re: Low Level Kafka Consumer for Spark
Date Mon, 15 Sep 2014 11:33:49 GMT
Hi Alon,

No, this does not guarantee that the same set of messages will end up in the
same RDD. The fix simply replays messages from the last processed offset, in
the same order. This is just an interim fix we needed to solve our use case.
If you do not need the message-replay feature, simply do not perform the
ack (acknowledgement) call in the Driver code. The processed offsets will then
not be written to ZooKeeper, and hence no replay will happen.
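As a rough illustration of the ack/replay behaviour described above, here is a
toy simulation of driver-side offset acknowledgement. All names here
(OffsetTracker, ack, replay_from) are hypothetical and stand in for the
receiver's real ZooKeeper-backed commit logic; they are not its actual API.

```python
# Toy model of ack-based replay: the driver acknowledges processed offsets,
# and on restart consumption resumes after the last acknowledged offset.

class OffsetTracker:
    """Stands in for the ZooKeeper offset commit (hypothetical helper)."""

    def __init__(self):
        self.committed = -1  # last offset acknowledged by the driver

    def ack(self, offset):
        # Driver-side acknowledgement: record the offset as processed.
        self.committed = offset

    def replay_from(self):
        # After driver recovery, replay starts just past the committed offset.
        return self.committed + 1


tracker = OffsetTracker()

# Suppose offsets 0..4 were processed and acked before the driver died.
for off in range(5):
    tracker.ack(off)

# Recovery replays from offset 5, in the original order -- but with no
# guarantee about how those messages are regrouped into RDDs.
assert tracker.replay_from() == 5

# If the driver never calls ack, nothing is committed, so there is no
# recorded replay point and no replay occurs.
no_ack = OffsetTracker()
assert no_ack.committed == -1
```

The key point the sketch captures: replay preserves offset order from the last
commit, but RDD boundaries are not part of what gets persisted.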

Regards,
Dibyendu

On Mon, Sep 15, 2014 at 4:48 PM, Alon Pe'er <alon.p@supersonicads.com>
wrote:

> Hi Dibyendu,
>
> Thanks for your great work!
>
> I'm new to Spark Streaming, so I just want to make sure I understand the
> Driver failure issue correctly.
>
> In my use case, I want to make sure that messages coming in from Kafka are
> always broken into the same set of RDDs, meaning that if a set of messages
> is assigned to one RDD, and the Driver dies before this RDD is processed,
> then once the Driver recovers, the same set of messages is assigned to a
> single RDD, instead of the messages being arbitrarily repartitioned across
> different RDDs.
>
> Does your Receiver guarantee this behavior, until the problem is fixed in
> Spark 1.2?
>
> Regards,
> Alon
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Low-Level-Kafka-Consumer-for-Spark-tp11258p14233.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>
>
