Hi all,

I keep experimenting with recovering from checkpoint and I noticed that if I comment out the code which created my stream and try to recover from checkpoint, I get an error:
13/09/19 16:38:16 ERROR CheckpointReader: Error loading checkpoint from file '/Temp/spark-checkpoint/graph'
java.lang.ClassNotFoundException: cpc.vsw.com.Main$$anonfun$1

Please note, that the code which created the stream is not used, because when spark recovers from checkpoint, it recreates all streams which existed in the previous run. If you try to create the stream the same way you created it initially, you will get a conflict error:
  13/09/19 16:40:49 ERROR ActorReceiver: Error receiving data
  akka.actor.InvalidActorNameException:actor name NetworkReceiver-0 is not unique!
which is expected, because deserialized akka (and other objects) are not created by usual means and will violate many constraints such as name uniqueness for example.


My question is:
If I want to release new version of my stream processing application (add/remove/change parameters) of my InputDStreams, how do I deploy the newly compiled application so that it restores correctly?

It seems more control over (de)serialization process is needed. For example, it would be useful to define some old streams as deleted to ignore them during deserialization. And a little bit more transparency over what exactly is being serialized would be great. I have no idea what cpc.vsw.com.Main$$anonfun$1 is and I'll have to decompile it to figure it out. If I add one more stream definition to my code, will it create a new Main$$anonfun$1 and the old one become Main$$anonfun$2 ? Will deserializer mess it up by assigning old RDD to wrong stream?
Also I would like to redefine params of my streams, for example if I want to change "checkpoint(Minutes(1))" to something else in my code, it will be ignored, because all streams and their params will be read from checkpoint and there is no way for me to associate deserialized stream with the code which sets its parameters.


Cheers,
Vadim.

--
From RFC 2631: In ASN.1, EXPLICIT tagging is implicit unless IMPLICIT is explicitly specified