So is there any documentation describing under which conditions my application can recover from the same checkpoint, and under which it cannot?

Tathagata Das <tathagata.das1565@gmail.com> wrote on Wed, Aug 30, 2017 at 1:20 PM:
Hello, 

This was an unfortunate design on my part when I was building DStreams :)

Fortunately, we learnt from our mistakes and built Structured Streaming the correct way. Checkpointing in Structured Streaming stores only the progress information (offsets, etc.), and the user can change their application code (within certain constraints, of course) and still restart from checkpoints (unlike DStreams). If you are just building out your streaming applications, then I highly recommend you try out Structured Streaming instead of DStreams (which is effectively in maintenance mode).
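
For illustration, a minimal sketch of a restartable Structured Streaming query (the broker address, topic, and paths below are placeholders, and the spark-sql-kafka-0-10 connector is assumed to be on the classpath):

import org.apache.spark.sql.SparkSession

// Placeholder app name, broker, topic, and paths.
val spark = SparkSession.builder
  .appName("StructuredKafkaExample")
  .getOrCreate()

// Read from Kafka (requires the spark-sql-kafka-0-10 package).
val input = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "events")
  .load()

// Only progress metadata (offsets, commit log) is written to the
// checkpoint location, so the query logic can be modified between
// restarts, within the documented constraints.
val query = input.selectExpr("CAST(value AS STRING) AS value")
  .writeStream
  .format("parquet")
  .option("path", "/data/output")
  .option("checkpointLocation", "/tmp/ss-checkpoint")
  .start()

query.awaitTermination()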


On Fri, Aug 25, 2017 at 7:41 PM, Hugo Reinwald <hugo.reinwald@gmail.com> wrote:
Hello,

I am implementing a Spark Streaming solution with Kafka and read that checkpoints cannot be used across application code changes - here
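
For context, the checkpoint-recovery pattern in question looks roughly like the sketch below (the broker address, topic, group id, and checkpoint path are placeholders; spark-streaming-kafka-0-10 is assumed):

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

val checkpointDir = "/tmp/checkpoint"

// Builds the full streaming graph; only invoked when no usable
// checkpoint exists in checkpointDir.
def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("DStreamsKafkaExample")
  val ssc = new StreamingContext(conf, Seconds(30))
  ssc.checkpoint(checkpointDir)

  val kafkaParams = Map[String, Object](
    "bootstrap.servers" -> "localhost:9092",
    "key.deserializer" -> classOf[StringDeserializer],
    "value.deserializer" -> classOf[StringDeserializer],
    "group.id" -> "example-group"
  )

  val stream = KafkaUtils.createDirectStream[String, String](
    ssc,
    LocationStrategies.PreferConsistent,
    ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams))

  stream.map(_.value).print()
  ssc
}

// Recovery deserializes the entire DStream graph from the checkpoint,
// which is what fails with InvalidClassException once the compiled
// classes no longer match the serialized ones.
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()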

I tested changes in the application code and got the error message below -

17/08/25 15:10:47 WARN CheckpointReader: Error reading checkpoint from file file:/tmp/checkpoint/checkpoint-1503641160000.bk
java.io.InvalidClassException: scala.collection.mutable.ArrayBuffer; local class incompatible: stream classdesc serialVersionUID = -2927962711774871866, local class serialVersionUID = 1529165946227428979 

While I understand that this is by design, can I ask why checkpointing works the way it does, verifying class signatures? Would it not be easier to let the developer decide whether to reuse the old checkpoints, depending on the nature of the change in application logic, e.g. changes unrelated to Spark/Kafka such as logging or configuration changes?

This is my first post in the group. Apologies if I am asking a question that has already been answered; I did a Nabble search and it didn't turn up the answer.

Thanks for the help.
Hugo