spark-user mailing list archives

From Kanwaldeep <kanwal...@gmail.com>
Subject Spark Streaming : Could not compute split, block not found
Date Fri, 01 Aug 2014 17:55:24 GMT
We are using Spark 1.0 for Spark Streaming on a Spark Standalone cluster and
are seeing the following error.

	Job aborted due to stage failure: Task 3475.0:15 failed 4 times, most recent failure: Exception failure in TID 216394 on host hslave33102.sjc9.service-now.com: java.lang.Exception: Could not compute split, block input-0-1406869340000 not found
	org.apache.spark.rdd.BlockRDD.compute(BlockRDD.scala:51)

We are using the MEMORY_AND_DISK_SER storage level for the input streams. The
stream is also persisted, since we apply multiple transformations to the
input stream.


    val lines = KafkaUtils.createStream[String, Array[Byte], StringDecoder, DefaultDecoder](
      ssc, kafkaParams, topicpMap, StorageLevel.MEMORY_AND_DISK_SER)

    lines.persist(StorageLevel.MEMORY_AND_DISK_SER)

We aggregate the data over 15-minute and one-hour windows. We set
spark.streaming.blockInterval=10000 (10 seconds) to keep the number of input
blocks low.
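
As a rough illustration of what a 10000 ms block interval means for these
windows, the arithmetic can be sketched as below. This assumes a single
receiver that produces at most one block per block interval; the object and
helper names are made up for the example and are not part of Spark.

```scala
// Rough arithmetic only: upper bound on receiver blocks per aggregation window.
// Assumption: one receiver, at most one block emitted per block interval.
// BlockCounts and blocksPerWindow are hypothetical names, not Spark APIs.
object BlockCounts {
  def blocksPerWindow(windowMs: Long, blockIntervalMs: Long): Long = {
    require(blockIntervalMs > 0, "block interval must be positive")
    windowMs / blockIntervalMs
  }

  val blockIntervalMs: Long = 10000L          // spark.streaming.blockInterval from the post
  val fifteenMinutesMs: Long = 15L * 60 * 1000
  val oneHourMs: Long = 60L * 60 * 1000

  def main(args: Array[String]): Unit = {
    println(s"15-minute window: ${blocksPerWindow(fifteenMinutesMs, blockIntervalMs)} blocks")
    println(s"1-hour window: ${blocksPerWindow(oneHourMs, blockIntervalMs)} blocks")
  }
}
```

Under those assumptions, every one of the up to 90 blocks in a 15-minute
window (360 in a one-hour window) must still be available when that window is
computed.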

The problem first appeared at the 15-minute interval, but since last night I
have been seeing it at every hourly interval as well.

Any suggestions?

Thanks
Kanwal



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.