spark-user mailing list archives

From freedafeng <>
Subject spark's behavior about failed tasks
Date Wed, 12 Aug 2015 20:22:19 GMT
Hello there,

I have a Spark job running on a 20-node cluster. The job is logically simple:
just a mapPartitions and then a sum. The return value of mapPartitions is an
integer for each partition. Some tasks failed at random (possibly caused by
third-party key-value store connections; the cause is irrelevant to my
question). In more detail:

1. Spark 1.1.1.
2. 4096 tasks total.
3. 66 failed tasks.

Spark seems to be rerunning all 4096 tasks instead of just the 66 failed ones.
It is currently at 469/4096 (stage 2).
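For reference, a minimal sketch of what the job described above might look like in PySpark. The per-partition logic here is hypothetical (the original post does not show code); the point is only the shape: mapPartitions yields one integer per partition, then sum() aggregates them.

```python
def count_partition(records):
    """Process one partition and yield a single integer for it.

    In the real job this would involve the third-party key-value store;
    here we just count records as a stand-in.
    """
    n = 0
    for _ in records:
        n += 1
    yield n

# With a SparkContext `sc` and an RDD `rdd`, the job would be roughly:
#   total = rdd.mapPartitions(count_partition).sum()
```

The function works on any iterator, matching the mapPartitions contract of iterator-in, iterator-out.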

Is this behavior normal? 

Thanks for your help!
