spark-issues mailing list archives

From "Josh Rosen (JIRA)" <>
Subject [jira] [Commented] (SPARK-16830) Executors Keep Trying to Fetch Blocks from a Bad Host
Date Thu, 22 Sep 2016 19:46:20 GMT


Josh Rosen commented on SPARK-16830:

Do you have stacktraces from the failed block fetches? I'd like to see whether this may be
fixed by a recent patch of mine which helps to avoid failures if all locations of non-shuffle
blocks are lost / unavailable.

> Executors Keep Trying to Fetch Blocks from a Bad Host
> -----------------------------------------------------
>                 Key: SPARK-16830
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Streaming
>    Affects Versions: 1.6.2
>         Environment: EMR 4.7.2
>            Reporter: Renxia Wang
> When a host becomes unreachable, the driver removes the executors and block managers on
> that host because it no longer receives heartbeats from them. However, executors on other
> hosts keep trying to fetch blocks from the bad host.
> I am running a Spark Streaming job to consume data from Kinesis. As a result of these
> block fetches retrying and failing, I started seeing ProvisionedThroughputExceededException
> on shards, AmazonHttpClient (to Kinesis) SocketException, and Kinesis ExpiredIteratorException.
> This issue also exposes a potential memory leak. From the time the bad host became
> unreachable, the physical memory usage of the executors that kept trying to fetch blocks
> from it started increasing, until they finally hit the physical memory limit and were
> killed by YARN.
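
The behavior described in the report can be illustrated with a small, self-contained sketch. This is not Spark's block-fetch code: `fetch_block`, `removed_hosts`, and `flaky_fetch` are hypothetical names made up for illustration, and the retry count is an arbitrary assumption. It only shows the difference between retrying against every known location and skipping locations whose block managers the driver has already removed.

```python
from typing import Callable, List, Optional, Sequence, Set, Tuple


def fetch_block(
    block_id: str,
    locations: Sequence[str],
    fetch: Callable[[str, str], bytes],
    removed_hosts: Set[str],
    max_retries: int = 3,
) -> Tuple[Optional[bytes], List[str]]:
    """Return (data, hosts_attempted); data is None if every location failed."""
    attempted: List[str] = []
    for host in locations:
        if host in removed_hosts:
            # Hypothetical check: the driver already removed this host's
            # block manager, so do not spend retries on it.
            continue
        for _ in range(max_retries):
            attempted.append(host)
            try:
                return fetch(host, block_id), attempted
            except ConnectionError:
                pass  # retry the same host until max_retries is used up
    return None, attempted


def flaky_fetch(host: str, block_id: str) -> bytes:
    """Stand-in transport: 'bad-host' is unreachable, any other host answers."""
    if host == "bad-host":
        raise ConnectionError(f"{host} unreachable")
    return f"{block_id}@{host}".encode()
```

With an empty `removed_hosts`, a fetch against `["bad-host", "good-host"]` makes three failed attempts on `bad-host` before succeeding on `good-host`. With `removed_hosts={"bad-host"}`, the only attempt is the successful one on `good-host`.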
