spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Max Xu <Max...@twosigma.com>
Subject streaming application throws IOException due to Log directory already exists during checkpoint recovery
Date Wed, 07 Jan 2015 16:55:57 GMT
Hi All,

I run a Spark streaming application (Spark 1.2.0) on YARN (Hadoop 2.5.2) with Spark event
log enabled. I set the checkpoint dir in the streaming context and run the app. It started
in YARN with application id 'app_id_1' and created the Spark event log dir /spark/applicationHistory/app_id_1.
I killed the app and rerun it with the same checkpoint dir, this time it had a different YARN
application id  'app_id_2'. However, rerun failed due to Log directory already exists:

Exception in thread "Driver" java.io.IOException: Log directory hdfs://xxx:8020/spark/applicationHistory/app_id_1
already exists!
        at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:129)
        at org.apache.spark.util.FileLogger.start(FileLogger.scala:115)
        at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:74)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:353)
        at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:118)
        at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:561)
        at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:561)
        at scala.Option.map(Option.scala:145)
        at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:561)
        at org.apache.spark.streaming.api.java.JavaStreamingContext$.getOrCreate(JavaStreamingContext.scala:566)
        at org.apache.spark.streaming.api.java.JavaStreamingContext.getOrCreate(JavaStreamingContext.scala)
        at com.xxx.spark.streaming.JavaKafkaSparkHbase.main(JavaKafkaSparkHbase.java:121)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:427)


Is this an expected behavior? When recoverying from the checkpoint, shouldn't an event log
dir with the name of a new application id created (in the above example, rerun should create
/spark/applicationHistory/app_id_2)?

Thanks,
Max

Mime
View raw message