spark-user mailing list archives

From David Morin <morin.david....@gmail.com>
Subject Spark 3.0.1 Structured streaming - checkpoints fail
Date Wed, 23 Dec 2020 16:07:10 GMT
Hello,

I have an issue with my PySpark job related to checkpointing.

Caused by: org.apache.spark.SparkException: Job aborted due to stage
failure: Task 3 in stage 16997.0 failed 4 times, most recent failure: Lost
task 3.3 in stage 16997.0 (TID 206609, 10.XXX, executor 4):
java.lang.IllegalStateException: Error reading delta file
file:/opt/spark/workdir/query6/checkpointlocation/state/0/3/1.delta of
HDFSStateStoreProvider[id = (op=0,part=3),dir =
file:/opt/spark/workdir/query6/checkpointlocation/state/0/3]:
*file:/opt/spark/workdir/query6/checkpointlocation/state/0/3/1.delta
does not exist*

This job is based on Spark 3.0.1 and Structured Streaming.
The Spark cluster (1 driver and 6 executors) runs without HDFS, and we
would prefer not to manage an HDFS cluster if possible.
Is a distributed filesystem necessary? What are the possible
solutions or workarounds?
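
To make the question concrete, here is a minimal sketch of a check on the checkpoint URI from the error above. It only inspects the URI scheme. The helper name `is_node_local` and the set of schemes are hypothetical, chosen for illustration, and the sketch assumes that a `file:` (or scheme-less) path is a plain local disk on each node rather than a mount shared by all executors, in which case the check would be too pessimistic.

```python
from urllib.parse import urlparse

# Assumption for illustration: these schemes resolve to each node's own disk.
# A "file:" path on a shared mount (e.g. NFS) would be a false positive here.
NODE_LOCAL_SCHEMES = {"", "file"}


def is_node_local(checkpoint_uri: str) -> bool:
    """Return True if the URI scheme suggests a per-node filesystem."""
    return urlparse(checkpoint_uri).scheme in NODE_LOCAL_SCHEMES


# Path taken from the stack trace in this message.
checkpoint = "file:/opt/spark/workdir/query6/checkpointlocation"
print(is_node_local(checkpoint))
```

If the check returns True on a multi-executor cluster, a state file written by one executor may not be visible to a task scheduled on another node, which would be consistent with the "does not exist" error for `1.delta`.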

Thanks in advance
David
