spark-dev mailing list archives

From Juan Rodríguez Hortalá <juan.rodriguez.hort...@gmail.com>
Subject Push epoch updates to executors on fetch failure to avoid fetch retries for missing executors
Date Wed, 25 Oct 2017 17:06:42 GMT
Hi,

I opened https://issues.apache.org/jira/browse/SPARK-22339 a few days ago
and would like to get some feedback on it. The idea is to push epoch
updates to the executors after a fetch failure by piggybacking on the
executor heartbeat response, so that readers fail faster when an executor
and its shuffle blocks are lost, instead of having to wait for all fetch
retries to fail and a new task to be started on the reader executors. This
can speed up job execution, particularly when executors are lost at the end
of a stage in a Spark application that runs a single action at a time. There
are more details and a draft patch in the JIRA.
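
To make the idea concrete, here is a toy sketch in plain Python. It is not
Spark code and not the draft patch: Driver, Executor, heartbeat_response and
fetch_with_retries are hypothetical names, and the simulation assumes one
heartbeat between fetch attempts and a fetch that always fails because the
remote executor is gone. It only illustrates how an epoch carried on the
heartbeat response could let a reader stop retrying early.

```python
from dataclasses import dataclass


@dataclass
class Driver:
    """Hypothetical stand-in for the driver side; not Spark's real classes."""
    epoch: int = 0

    def on_fetch_failure(self) -> None:
        # A reported fetch failure means some shuffle output is gone: bump the epoch.
        self.epoch += 1

    def heartbeat_response(self) -> int:
        # Piggyback the current epoch on the reply to an executor heartbeat.
        return self.epoch


@dataclass
class Executor:
    """Hypothetical reader executor that caches the newest epoch it has seen."""
    known_epoch: int = 0

    def heartbeat(self, driver: Driver) -> None:
        self.known_epoch = max(self.known_epoch, driver.heartbeat_response())

    def fetch_with_retries(self, task_epoch: int, max_retries: int, driver: Driver) -> int:
        """Return how many fetch attempts were made against the lost executor."""
        attempts = 0
        for _ in range(max_retries):
            # A newer epoch than the task's own means the blocks are already
            # known to be lost, so further retries are pointless.
            if self.known_epoch > task_epoch:
                break
            attempts += 1            # simulated fetch; always fails in this sketch
            self.heartbeat(driver)   # assumption: one heartbeat between attempts
        return attempts
```

In this sketch, a reader whose driver never bumps the epoch burns all of its
retries, while a reader that learns of a newer epoch through a heartbeat
stops after the attempt already in flight.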

Looking forward to your feedback on this.

Greetings,

Juan
