spark-user mailing list archives

From Hugo Reinwald <hugo.reinw...@gmail.com>
Subject Why do checkpoints work the way they do?
Date Sat, 26 Aug 2017 02:41:51 GMT
Hello,

I am implementing a Spark Streaming solution with Kafka, and I read here
that checkpoints cannot be reused across application code changes:
<https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html>

I tested a change in the application code and got the error message below:

17/08/25 15:10:47 WARN CheckpointReader: Error reading checkpoint from file file:/tmp/checkpoint/checkpoint-1503641160000.bk
java.io.InvalidClassException: scala.collection.mutable.ArrayBuffer; local class incompatible: stream classdesc serialVersionUID = -2927962711774871866, local class serialVersionUID = 1529165946227428979

While I understand that this is by design, why does checkpointing verify
class signatures in this way? Would it not be easier to let the developer
decide whether to reuse the old checkpoints, depending on what changed in
the application logic, e.g. changes unrelated to Spark/Kafka such as
logging or configuration changes?
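
For reference, the java.io.InvalidClassException in the log is raised by
standard Java serialization, which compares the serialVersionUID recorded in
the stream with the one of the locally loaded class. A minimal sketch of that
check follows. It does not use Spark at all; the class Payload and the helper
names are hypothetical, and the stream is patched by hand only to simulate a
checkpoint written by a different class version.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionDemo {
    // Hypothetical stand-in for any class whose instances end up in a checkpoint.
    static class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 42;
    }

    // Serialize an object with plain Java serialization.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    // Return a copy of the stream whose recorded serialVersionUID differs by one bit.
    // Stream layout: 2 bytes magic, 2 bytes version, TC_OBJECT, TC_CLASSDESC,
    // 2 bytes class-name length, the class name, then the 8-byte serialVersionUID.
    static byte[] withDifferentUid(byte[] bytes) {
        byte[] copy = bytes.clone();
        int nameLen = ((copy[6] & 0xFF) << 8) | (copy[7] & 0xFF);
        int uidOffset = 8 + nameLen;
        copy[uidOffset + 7] ^= 0x01; // flip the lowest bit of the UID
        return copy;
    }

    // Try to deserialize and report whether the class-version check passed.
    static String tryRead(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            ois.readObject();
            return "ok";
        } catch (InvalidClassException e) {
            return "InvalidClassException";
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serialize(new Payload());
        // The UID the local class declares.
        System.out.println(ObjectStreamClass.lookup(Payload.class).getSerialVersionUID());
        // Matching UID: the object is read back.
        System.out.println(tryRead(bytes));
        // Mismatching UID: deserialization is rejected.
        System.out.println(tryRead(withDifferentUid(bytes)));
    }
}
```

The rejection happens inside ObjectInputStream before any application code
runs, so in this sketch the reader has no hook for accepting the old bytes
once the two UIDs differ.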

This is my first post to the group. Apologies if this question has been
asked before; I did a Nabble search and it did not turn up an answer.

Thanks for the help.
Hugo
