spark-user mailing list archives

From Sudhir Babu Pothineni <sbpothin...@gmail.com>
Subject Re: Zero Data Loss in Spark with Kafka
Date Tue, 23 Aug 2016 15:45:05 GMT
Saving offsets to ZooKeeper is the older approach. Checkpointing saves the
offsets internally to HDFS (or wherever the checkpoint directory is located).

More details here:
http://spark.apache.org/docs/latest/streaming-kafka-integration.html
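
To illustrate the idea behind checkpointed offsets, here is a minimal sketch in
plain Python. It is a toy model only, not Spark's or Kafka's API: the "log" is
an in-memory list, the checkpoint is a small JSON file, and every name in it
(load_offset, save_offset, run_batches, fail_at_batch) is hypothetical. The
offset is persisted only after a batch has been written out, so a crash causes
the batch to be replayed rather than lost.

```python
import json
import os
import tempfile


def load_offset(checkpoint_path):
    """Return the next offset to read, or 0 if no checkpoint exists yet."""
    if not os.path.exists(checkpoint_path):
        return 0
    with open(checkpoint_path) as f:
        return json.load(f)["next_offset"]


def save_offset(checkpoint_path, next_offset):
    """Persist the offset atomically: write a temp file, then rename it."""
    directory = os.path.dirname(checkpoint_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_offset": next_offset}, f)
    os.replace(tmp_path, checkpoint_path)


def run_batches(log, checkpoint_path, batch_size, sink, fail_at_batch=None):
    """Process the log in batches, resuming from the checkpointed offset.

    fail_at_batch (hypothetical test hook) simulates a crash after the
    batch output is written but before its offset is checkpointed.
    """
    offset = load_offset(checkpoint_path)
    batch_no = 0
    while offset < len(log):
        batch = log[offset:offset + batch_size]
        sink.extend(batch)  # write the output first
        if batch_no == fail_at_batch:
            raise RuntimeError("simulated driver crash")
        offset += len(batch)
        save_offset(checkpoint_path, offset)  # then record progress
        batch_no += 1
```

Note that in this sketch a replayed batch is written to the sink a second time,
so nothing is lost but duplicates are possible unless the output is idempotent.
In Spark itself you would not write this by hand; as far as I know the usual
route is to set a checkpoint directory and recover the context with
StreamingContext.getOrCreate, as described in the guide linked above.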

On Tue, Aug 23, 2016 at 10:30 AM, KhajaAsmath Mohammed <
mdkhajaasmath@gmail.com> wrote:

> Hi Experts,
>
> I am looking for information on how to achieve zero data loss when
> working with Kafka and Spark. I have searched online, and the blogs give
> different answers. Please let me know if anyone has ideas on this.
>
> Blog 1:
> https://databricks.com/blog/2015/01/15/improved-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html
>
>
> Blog 2:
> http://aseigneurin.github.io/2016/05/07/spark-kafka-achieving-zero-data-loss.html
>
>
> Blog 1 simply describes a configuration change with a checkpoint directory,
> while Blog 2 gives details on how to save offsets to ZooKeeper. Can you
> please help me out with the right approach?
>
> Thanks,
> Asmath
>
>
>
