spark-issues mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-5418) Output directory for shuffle should consider left space of each directory set in conf
Date Tue, 27 Jan 2015 22:53:36 GMT

     [ https://issues.apache.org/jira/browse/SPARK-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-5418.
------------------------------
    Resolution: Duplicate

> Output directory for shuffle should consider left space of each directory set in conf
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-5418
>                 URL: https://issues.apache.org/jira/browse/SPARK-5418
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 1.2.0
>         Environment: Ubuntu, others should be similar
>            Reporter: ding
>            Priority: Minor
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> I set multiple directories in the conf spark.local.dir as "scratch" space. One of them
> (e.g. /mnt/disk1) has 30G of free space, while the others (e.g. /mnt/disk2) have 100G. In the
> current version, Spark uses a hash to decide which directory is used for "scratch" space,
> which means each directory has the same chance of being chosen. After hundreds of iterations
> of PageRank, a "No space left" exception is thrown and the driver crashes. This does not make
> sense, since there is still 70G+ of free space in the other directories. We should take the
> free space on each directory into account when deciding which directory should be the map
> output dir. I will send a PR for this.
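
For illustration only, the difference between the two selection policies described above can be sketched in Python. This is not Spark's code (Spark itself is written in Scala) and not the reporter's PR. The function names, the CRC32 hash, and the "pick the directory with the most free space" rule are all assumptions made for this sketch. The list of directories is assumed to be non-empty.

```python
import shutil
import zlib
from typing import Callable, Sequence


def pick_by_hash(filename: str, dirs: Sequence[str]) -> str:
    """Hash-based choice: every directory is equally likely to be picked,
    no matter how much free space it has (the behavior the report describes)."""
    # CRC32 is an arbitrary stand-in hash for this sketch.
    return dirs[zlib.crc32(filename.encode("utf-8")) % len(dirs)]


def free_bytes(path: str) -> int:
    """Free space on the filesystem that holds `path`, in bytes."""
    return shutil.disk_usage(path).free


def pick_by_free_space(
    dirs: Sequence[str],
    needed_bytes: int = 0,
    free_of: Callable[[str], int] = free_bytes,
) -> str:
    """Hypothetical alternative: choose the directory with the most free space.

    `free_of` is injectable so the policy can be exercised without real disks.
    """
    # Measure each directory once, then take the one with the largest value.
    free, best = max(((free_of(d), d) for d in dirs), key=lambda pair: pair[0])
    if free < needed_bytes:
        raise OSError(f"no directory has {needed_bytes} bytes free")
    return best
```

With the figures from the report (30G free on /mnt/disk1, 100G free on /mnt/disk2), the free-space policy would pick /mnt/disk2, whereas the hash policy may pick either directory. A production policy would also have to consider concurrent writers and how often free space is re-measured, which this sketch ignores.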



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

