spark-user mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: How to replay consuming messages from kafka using spark streaming?
Date Thu, 15 Jan 2015 05:26:37 GMT
Take a look at the implementation linked from this JIRA issue:

https://issues.apache.org/jira/browse/SPARK-4964

and see whether it would meet your needs.
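
To make the replay pattern in the question concrete, here is a minimal sketch. It is not the code linked from SPARK-4964 and uses no Spark or Kafka APIs. It is a pure-Python simulation in which a list stands in for a Kafka partition, and `OffsetStore`, `run_batch`, and `flaky_write` are hypothetical names invented for illustration. The idea it shows is simply that the stored offset advances only after a successful write, so a failed batch is read again from the same position.

```python
from typing import Callable, List


class OffsetStore:
    """Hypothetical stand-in for wherever the last good offset is persisted."""

    def __init__(self) -> None:
        self.committed = 0  # offset of the next message to read


def run_batch(log: List[str], store: OffsetStore, batch_size: int,
              write: Callable[[List[str]], None]) -> bool:
    """Read one batch starting at the committed offset and try to write it.

    The offset is advanced only if `write` succeeds, so a failed batch
    is replayed on the next call.
    """
    start = store.committed
    end = min(start + batch_size, len(log))
    batch = log[start:end]
    try:
        write(batch)
    except IOError:
        return False  # offset unchanged: the same range is read next time
    store.committed = end
    return True


# Simulated partition with five messages (assumption: one partition only).
log = [f"m{i}" for i in range(5)]
store = OffsetStore()
written: List[str] = []
attempts = {"n": 0}


def flaky_write(batch: List[str]) -> None:
    """Simulated sink that fails on the first call and succeeds afterwards."""
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise IOError("simulated HDFS write failure")
    written.extend(batch)


first = run_batch(log, store, 3, flaky_write)   # write fails, offset stays 0
second = run_batch(log, store, 3, flaky_write)  # same range is replayed
third = run_batch(log, store, 3, flaky_write)   # remaining messages
```

In a real job, the offset would have to be stored durably, and ideally together with the output, so that a crash between the write and the offset update cannot lose or duplicate a batch.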

On Wed, Jan 14, 2015 at 9:58 PM, mykidong <mykidong@gmail.com> wrote:

> Hi,
>
> My Spark Streaming job does a Kafka-to-HDFS ETL.
> For instance, every 10 minutes the streaming job retrieves messages from
> Kafka and saves them as Avro files on HDFS.
> My question: if a worker occasionally fails to write the Avro files to HDFS,
> I want to replay consumption from the last successful Kafka offset.
> I think the Spark Streaming Kafka receiver is written using the Kafka
> high-level consumer API, not the simple consumer API.
>
> Any idea how to replay Kafka consumption in Spark Streaming?
>
> - Kidong.
