spark-issues mailing list archives

From "Apache Spark (JIRA)" <>
Subject [jira] [Commented] (SPARK-12583) spark shuffle fails with mesos after 2mins
Date Tue, 24 May 2016 18:33:13 GMT


Apache Spark commented on SPARK-12583:

User 'corruptmemory' has created a pull request for this issue:

> spark shuffle fails with mesos after 2mins
> ------------------------------------------
>                 Key: SPARK-12583
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 1.6.0
>            Reporter: Adrian Bridgett
>            Assignee: Bertrand Bossy
>             Fix For: 2.0.0
> See user mailing list "Executor deregistered after 2mins" for more details.
> As of 1.6, the driver registers with each shuffle manager via MesosExternalShuffleClient. Once this disconnects, the shuffle manager automatically cleans up the data associated with that driver.
> However, the connection is terminated before this happens as it's idle. Looking at a packet trace, after 120 seconds the shuffle manager sends a FIN packet to the driver. The only way to delay this is to increase on the shuffle
> I patched the MesosExternalShuffleClient (and ExternalShuffleClient) with newbie Scala skills to make the TransportContext call with closeIdleConnections "false", and this didn't help (I hadn't done the network trace first).
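
The idle-timeout behaviour described above can be modelled with a small sketch. This is not Spark's code: the class IdleConnectionSketch and its methods are hypothetical names made up for illustration, and the 120-second value is taken only from the packet trace mentioned in the report. It shows why a connection that carries no traffic is treated as idle after the timeout, while one that sees periodic traffic (for example some form of heartbeat, which is an assumption here) is not.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical model of an idle-timeout check; not Spark's actual implementation. */
public class IdleConnectionSketch {
    private final long timeoutNanos;
    private long lastActivityNanos;

    public IdleConnectionSketch(long timeoutSeconds, long nowNanos) {
        this.timeoutNanos = TimeUnit.SECONDS.toNanos(timeoutSeconds);
        this.lastActivityNanos = nowNanos;
    }

    /** Record traffic on the connection at the given time. */
    public void touch(long nowNanos) {
        lastActivityNanos = nowNanos;
    }

    /** True once no traffic has been seen for longer than the timeout. */
    public boolean isIdle(long nowNanos) {
        return nowNanos - lastActivityNanos > timeoutNanos;
    }

    public static void main(String[] args) {
        long sec = TimeUnit.SECONDS.toNanos(1);

        // A driver connection that registers once and then stays silent.
        IdleConnectionSketch silent = new IdleConnectionSketch(120, 0);
        System.out.println("silent idle at 121s: " + silent.isIdle(121 * sec));

        // A connection that sees some traffic every 60 seconds.
        IdleConnectionSketch active = new IdleConnectionSketch(120, 0);
        for (long t = 60; t <= 240; t += 60) {
            active.touch(t * sec);
        }
        System.out.println("active idle at 241s: " + active.isIdle(241 * sec));
    }
}
```

In this model the silent connection counts as idle shortly after 120 seconds, which matches the point at which the report sees the FIN packet. Whether the real shuffle service decides idleness this way is an assumption.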

This message was sent by Atlassian JIRA
