spark-user mailing list archives

From Svend
Subject Systematic error when re-starting Spark stream unless I delete all checkpoints
Date Thu, 25 Sep 2014 14:20:40 GMT
I am experiencing Spark Streaming restart issues similar to those discussed in
the two threads below (in which I could not find a solution). Could anybody
let me know whether anything is wrong in the way I start and stop the stream,
or whether this could be a Spark bug?

My stream reads a Kafka topic, does some processing involving an
updateStateByKey and saves the result to HDFS.
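
A minimal sketch of a pipeline of this shape, assuming the Spark 1.1 Scala API
and the spark-streaming-kafka artifact on the classpath. It is an illustration
only, not the code this post refers to: the object name, the ZooKeeper address,
the group id, the topic, the HDFS paths and the counting logic are all
hypothetical placeholders. The 10-second batch interval is taken from the
"5 times 10 seconds" remark further down.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._ // pair-DStream implicits (needed in Spark 1.x)
import org.apache.spark.streaming.kafka.KafkaUtils

object CheckpointedStream {
  // Hypothetical locations, not the ones used in this post.
  val checkpointDir = "hdfs:///tmp/example/checkpoints"
  val outputPrefix  = "hdfs:///tmp/example/output/counts"

  // Running count per key: add this batch's values to the previous state.
  def updateCount(newValues: Seq[Int], state: Option[Long]): Option[Long] =
    Some(state.getOrElse(0L) + newValues.sum)

  // Builds the complete DStream graph. getOrCreate calls this only when
  // no usable checkpoint is found in checkpointDir.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("checkpointed-stream-sketch")
    val ssc  = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)

    // Hypothetical ZooKeeper quorum, consumer group and topic (1 receiver thread).
    val messages = KafkaUtils.createStream(
      ssc, "zkhost:2181", "example-group", Map("example-topic" -> 1))

    messages
      .map { case (_, value) => (value, 1) }
      .updateStateByKey[Long](updateCount _)
      .saveAsTextFiles(outputPrefix)

    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recover from the checkpoint if one exists, otherwise build a new context.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)

    // Assumption: a graceful stop on JVM shutdown (e.g. Ctrl-C) is wanted.
    sys.addShutdownHook {
      ssc.stop(stopSparkContext = true, stopGracefully = true)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

In this pattern the whole DStream set-up lives inside the function passed to
StreamingContext.getOrCreate, because on a restart the graph is rebuilt from
the checkpoint data rather than by running that function again.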

The context is (re)-created at startup as follows: 

And the start-up and shutdown of the stream is handled as follows: 

When starting the stream for the first time (with spark-submit), the
processing succeeds, folders are created in the target HDFS folder and
streaming stats are visible on http://sparkhost:4040/streaming.

After letting the stream run for several minutes and then stopping it
(Ctrl-C on the command line), the following info is visible in the
checkpoint folder: 

(Checkpoint clean-up seems to be happening, since the stream ran for much
longer than 5 times 10 seconds.)

When re-starting the stream, the startup fails with the error below,
http://sparkhost:4040/streaming shows no statistics, no new HDFS folder is
added in the target folder and no new checkpoints are created: 

Now if I delete all older checkpoints and keep only the most recent one: 

I end up with this (Kafka?) non-unique actor name error. 

If I delete the checkpoint folder, the stream starts successfully (but I lose
my ongoing stream state, obviously).

We're running Spark 1.1.0 on Mesos 0.20. Our Spark assembly is packaged with
CDH 5.1.0 and Hive: 

Any comment or suggestion would be greatly appreciated.
