spark-user mailing list archives

From mykidong <>
Subject How to replay consuming messages from kafka using spark streaming?
Date Thu, 15 Jan 2015 03:58:46 GMT

My Spark Streaming job works as a Kafka-to-HDFS ETL.
For instance, every 10 minutes the job retrieves messages from Kafka and
saves them as Avro files on HDFS.
My question: if a worker occasionally fails to write the Avro files to HDFS,
I want to replay consumption starting from the last Kafka offset that was
written successfully.
As far as I can tell, the Spark Streaming Kafka receiver is written against
the Kafka High Level Consumer API, not the Simple Consumer API, so I do not
see a way to choose the starting offset myself.
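
To make the behavior I am after concrete, here is a minimal sketch in plain
Python. It does not use any Spark or Kafka API: the list `log` stands in for a
single Kafka partition, and `OffsetStore`, `read_batch`, `run_batch` and
`make_flaky_writer` are hypothetical names invented only for this illustration.
The idea is that the stored offset moves forward only after the write
succeeds, so a failed batch is read again on the next run.

```python
# Hypothetical in-memory stand-ins; no real Kafka or Spark API is used here.

class OffsetStore:
    """Holds the next offset to read: one past the last message written successfully."""
    def __init__(self):
        self.next_offset = 0


def read_batch(log, start, max_messages):
    """Return (messages, end_offset) for the half-open range [start, end_offset)."""
    end = min(start + max_messages, len(log))
    return log[start:end], end


def run_batch(log, store, write, max_messages):
    """Read one batch from the stored offset; advance the offset only if write succeeds."""
    messages, end = read_batch(log, store.next_offset, max_messages)
    try:
        write(messages)
    except IOError:
        # Offset is left untouched, so the next call replays the same range.
        return False
    store.next_offset = end
    return True


def make_flaky_writer(sink, fail_on_calls):
    """Build a writer that raises IOError on the given 1-based call numbers."""
    calls = {"n": 0}

    def write(messages):
        calls["n"] += 1
        if calls["n"] in fail_on_calls:
            raise IOError("simulated HDFS write failure")
        sink.extend(messages)

    return write


log = ["m%d" % i for i in range(6)]   # stand-in for one Kafka partition
sink = []                             # stand-in for files written to HDFS
store = OffsetStore()
write = make_flaky_writer(sink, fail_on_calls={2})

# Four batch intervals; the second write fails and is replayed by the third.
results = [run_batch(log, store, write, max_messages=2) for _ in range(4)]
```

In this sketch the second batch fails, the third one re-reads the same two
messages, and nothing is lost or written twice. This only works if the write
itself is all-or-nothing for a batch, which is an assumption of the sketch.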

Any ideas on how to replay Kafka consumption in Spark Streaming?

- Kidong.
