spark-user mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: Restart App and consume from checkpoint using direct kafka API
Date Thu, 31 Mar 2016 14:56:58 GMT
Long story short, no.  Don't rely on checkpoints if you can't handle
reprocessing some of your data.
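
A minimal sketch of the approach described in the quoted reply below
(storing offsets in your own database) might look like this. It is not
Spark code: sqlite3 stands in for whatever database you use, and the
table names (kafka_offsets, results) and function names are hypothetical,
chosen only for illustration. The point it shows is that results and
offsets are written in one transaction, so a replayed batch is rejected
instead of being stored twice.

```python
import sqlite3


def init_store(conn):
    """Create the (hypothetical) offset and result tables if missing."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kafka_offsets ("
        " topic TEXT NOT NULL,"
        " part INTEGER NOT NULL,"
        " until_offset INTEGER NOT NULL,"
        " PRIMARY KEY (topic, part))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        " topic TEXT, part INTEGER, value TEXT)"
    )
    conn.commit()


def load_offsets(conn):
    """Return {(topic, partition): next offset to read} at startup."""
    rows = conn.execute("SELECT topic, part, until_offset FROM kafka_offsets")
    return {(topic, part): offset for topic, part, offset in rows}


def save_batch(conn, topic, part, from_offset, until_offset, values):
    """Store one partition's results and its new offset atomically."""
    with conn:  # commits on success, rolls back if anything raises
        stored = conn.execute(
            "SELECT until_offset FROM kafka_offsets WHERE topic = ? AND part = ?",
            (topic, part),
        ).fetchone()
        # Reject a batch that does not start where the last one ended,
        # e.g. a replay after a restart.
        if stored is not None and stored[0] != from_offset:
            raise ValueError(
                "expected batch to start at %d, got %d" % (stored[0], from_offset)
            )
        conn.executemany(
            "INSERT INTO results (topic, part, value) VALUES (?, ?, ?)",
            [(topic, part, v) for v in values],
        )
        conn.execute(
            "INSERT OR REPLACE INTO kafka_offsets (topic, part, until_offset)"
            " VALUES (?, ?, ?)",
            (topic, part, until_offset),
        )
```

On restart with a new jar, the idea would be to call load_offsets and hand
the result to the direct stream as its starting offsets (the fromOffsets
argument of KafkaUtils.createDirectStream in the Spark 1.x Kafka
integration) rather than recovering from the checkpoint.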

On Thu, Mar 31, 2016 at 3:02 AM, Imre Nagi <imre.nagi2812@gmail.com> wrote:
> I don't know how to read the data from the checkpoint. But AFAIK, and based
> on my experience, I think the best thing you can do is store the offset in
> some storage such as a database every time you consume a message. Then read
> the offset from the database every time you want to start reading
> messages.
>
> nb: This approach is also explained by Cody in his blog post.
>
> Thanks
>
> On Thu, Mar 31, 2016 at 2:14 PM, vimal dinakaran <vimal3271@gmail.com>
> wrote:
>>
>> Hi,
>> In the blog post
>> https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md
>>
>> it is mentioned that enabling checkpointing works as long as the app jar
>> is unchanged.
>>
>> If I want to upgrade the jar with the latest code and resume consuming
>> from Kafka where it stopped, how do I do that?
>> Is there a way to read the binary checkpoint object during init and use
>> it to start from the stored offsets?
>>
>> Thanks
>> Vimal
>
>
