hadoop-mapreduce-dev mailing list archives

From "Sandy Ryza (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-4933) MR1 merger asks for length of file it just wrote before flushing it
Date Thu, 10 Jan 2013 21:30:12 GMT
Sandy Ryza created MAPREDUCE-4933:
-------------------------------------

             Summary: MR1 merger asks for length of file it just wrote before flushing it
                 Key: MAPREDUCE-4933
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4933
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv1, task
    Affects Versions: 1.1.1
            Reporter: Sandy Ryza
            Assignee: Sandy Ryza


createKVIterator in ReduceTask contains the following code:
{code}

          try {
            Merger.writeFile(rIter, writer, reporter, job);
            addToMapOutputFilesOnDisk(fs.getFileStatus(outputPath));
          } catch (Exception e) {
            if (null != outputPath) {
              fs.delete(outputPath, true);
            }
            throw new IOException("Final merge failed", e);
          } finally {
            if (null != writer) {
              writer.close();
            }
          }
{code}

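Merger#writeFile() does not close (or flush) the writer after writing, so when
fs.getFileStatus() is called on outputPath, some of the data may still be buffered and the
reported length may be smaller than what was actually written.  The writer is only closed
afterwards, in the finally block.  The wrong length causes bad accounting further down the
line, which can lead to map output data being lost.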
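
The effect can be reproduced outside Hadoop.  The following is only a minimal sketch, not the
ReduceTask code: it assumes a local temp file and a java.io.BufferedOutputStream as stand-ins
for the merge output and its writer, and the class and method names are hypothetical.  It
reads the file length once before the stream is closed and once after.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; none of these names come from Hadoop.
public class FlushBeforeLength {

    // Writes payloadBytes zero bytes through a 64 KiB buffer and returns
    // {length seen before close, length seen after close}.
    public static long[] lengthsBeforeAndAfterClose(int payloadBytes) throws IOException {
        Path path = Files.createTempFile("merge-sketch", ".out");
        try {
            long before;
            OutputStream out = new BufferedOutputStream(Files.newOutputStream(path), 64 * 1024);
            try {
                out.write(new byte[payloadBytes]);
                // Same ordering as in the report: the length is queried while
                // the payload may still sit in the in-memory buffer.
                before = Files.size(path);
            } finally {
                out.close(); // close() flushes the buffer to the file
            }
            // Querying after close() sees everything that was written.
            long after = Files.size(path);
            return new long[] {before, after};
        } finally {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] lengths = lengthsBeforeAndAfterClose(1024);
        System.out.println("before close: " + lengths[0] + ", after close: " + lengths[1]);
    }
}
```

With a 1024-byte payload, which fits in the buffer, the length read before close() is 0 and
the length read after close() is 1024.  By analogy, one way to avoid the problem in the
snippet above would be to close the writer before calling fs.getFileStatus(outputPath).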

