hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6849) Have LocalDirAllocator.AllocatorPerContext.getLocalPathForWrite fail more meaningfully
Date Mon, 05 Jul 2010 11:44:49 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885187#action_12885187 ]

Steve Loughran commented on HADOOP-6849:

Obviously, building up an error string during the directory search is a waste of effort for
every successful operation, so the diagnostics should only be created on failure. A list of
directories and the target file size may be enough, as it would catch directory config issues.
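
A minimal sketch of that idea, not the actual LocalDirAllocator code: the search loop
builds no strings, and the message listing the directories and the requested size is
only assembled once every candidate has been rejected. The class LocalDirChooser and
its methods are hypothetical names used for illustration, and a plain IOException
stands in for DiskErrorException.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop.
final class LocalDirChooser {

    /** Returns the first directory that exists and has at least `size` usable bytes. */
    static File choose(List<File> dirs, long size) throws IOException {
        for (File dir : dirs) {
            // Success path: no diagnostic strings are built here.
            if (dir.isDirectory() && dir.getUsableSpace() >= size) {
                return dir;
            }
        }
        // Failure path only: now pay the cost of building the diagnostics.
        throw new IOException(describeFailure(dirs, size));
    }

    /** Builds the failure message: the target size plus every directory examined. */
    static String describeFailure(List<File> dirs, long size) {
        StringBuilder sb = new StringBuilder(
                "Could not find any valid local directory for a file of ")
                .append(size).append(" bytes; directories examined:");
        for (File dir : dirs) {
            sb.append(' ').append(dir.getPath());
        }
        return sb.toString();
    }
}
```

Because describeFailure is only called from the throw site, successful allocations
pay nothing extra for the improved message.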

> Have LocalDirAllocator.AllocatorPerContext.getLocalPathForWrite fail more meaningfully
> --------------------------------------------------------------------------------------
>                 Key: HADOOP-6849
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6849
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.20.2
>            Reporter: Steve Loughran
>            Priority: Minor
> A stack trace made its way to me, of a reduce failing:
> {code}
> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for file:/mnt/data/dfs/data/mapred/local/taskTracker/jobcache/job_201007011427_0001/attempt_201007011427_0001_r_000000_1/output/map_96.out
>       at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>       at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2434)
> {code}
> We're probably running out of HDD space; if not, it's a configuration problem. Either way,
> some more hints in the exception would be handy.
> # Include the size of the output file looked for, if known
> # Include the list of dirs examined and their reason for rejection (not found or, if not
> enough room, the available space).
> This would make it easier to diagnose problems after the event, with nothing but emailed
> logs for diagnostics.
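
The per-directory rejection reason proposed in the issue could be sketched as below.
This is an illustration under assumptions, not the LocalDirAllocator implementation:
DirRejection and reasonFor are hypothetical names, and the two reasons mirror the ones
listed in the issue (directory not found, or not enough room together with the
available space).

```java
import java.io.File;

// Hypothetical helper for illustration only; not part of Hadoop.
final class DirRejection {

    /** Returns null if the directory can hold `size` bytes, else a short reason. */
    static String reasonFor(File dir, long size) {
        if (!dir.isDirectory()) {
            return dir.getPath() + ": not found or not a directory";
        }
        long available = dir.getUsableSpace();
        if (available < size) {
            return dir.getPath() + ": not enough room (" + available
                    + " bytes available, " + size + " needed)";
        }
        return null; // acceptable, nothing to report
    }
}
```

Calling reasonFor for each candidate only after the search has failed keeps the cost
off the successful path while still giving enough detail to diagnose from logs alone.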

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
