hadoop-mapreduce-dev mailing list archives

From "jay vyas (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-5902) JobHistoryServer needs more debug logs.
Date Fri, 23 May 2014 04:14:01 GMT
jay vyas created MAPREDUCE-5902:

             Summary: JobHistoryServer needs more debug logs.
                 Key: MAPREDUCE-5902
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5902
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: jobhistoryserver
            Reporter: jay vyas

With the JobHistoryServer, it appears that certain history files can sometimes be skipped.
I haven't been able to determine why yet, but I've found that some .jhist files with long
names aren't getting collected into the done/ directory.

After tracing through the actual source and turning on DEBUG-level logging, it became clear
that the snippet below is an important workhorse (scanDirectoryForIntermediateFiles and
scanDirectoryForHistoryFiles ultimately boil down to scanDirectory()).

It would be extremely useful, then, to have a couple of guarded logs at this level of the
code, so that we can see, in the log folders, why files are being filtered out, i.e. whether
it is due to filtering or visibility.


  private static List<FileStatus> scanDirectory(Path path, FileContext fc,
      PathFilter pathFilter) throws IOException {
    path = fc.makeQualified(path);
    List<FileStatus> jhStatusList = new ArrayList<FileStatus>();
    RemoteIterator<FileStatus> fileStatusIter = fc.listStatus(path);
    while (fileStatusIter.hasNext()) {
      FileStatus fileStatus = fileStatusIter.next();
      Path filePath = fileStatus.getPath();
      // Entries that are not files, or that the filter rejects, are dropped silently here.
      if (fileStatus.isFile() && pathFilter.accept(filePath)) {
        jhStatusList.add(fileStatus);
      }
    }
    return jhStatusList;
  }
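
To illustrate the kind of guarded logging being requested, here is a minimal, self-contained sketch. It is not the Hadoop code and not a proposed patch: it assumes java.nio.file in place of FileContext, a Predicate<Path> in place of PathFilter, and java.util.logging (with isLoggable(Level.FINE) standing in for a debug-enabled guard). The class name ScanDirectorySketch and the file names in main are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the scanDirectory() logic, using only the JDK.
final class ScanDirectorySketch {
  private static final Logger LOG =
      Logger.getLogger(ScanDirectorySketch.class.getName());

  static List<Path> scanDirectory(Path dir, Predicate<Path> pathFilter)
      throws IOException {
    List<Path> accepted = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
      for (Path entry : entries) {
        boolean isFile = Files.isRegularFile(entry);
        // Same short-circuit as the original: the filter only sees files.
        boolean passes = isFile && pathFilter.test(entry);
        if (passes) {
          accepted.add(entry);
        } else if (LOG.isLoggable(Level.FINE)) {
          // Guarded log: the message is only built when FINE is enabled.
          String reason = isFile ? "rejected by path filter" : "not a regular file";
          LOG.fine("Skipping " + entry + ": " + reason);
        }
      }
    }
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine("Accepted " + accepted.size() + " file(s) from " + dir);
    }
    return accepted;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical directory contents, created only for this demonstration.
    Path dir = Files.createTempDirectory("jhs-sketch");
    Files.createFile(dir.resolve("job_1.jhist"));
    Files.createFile(dir.resolve("job_1_conf.xml"));
    Files.createDirectory(dir.resolve("subdir.jhist"));

    List<Path> found =
        scanDirectory(dir, p -> p.getFileName().toString().endsWith(".jhist"));
    System.out.println(found.size()); // prints 1
  }
}
```

With logging at FINE, each skipped entry is reported with one of two reasons, which is the distinction the description asks for. Visibility problems (for example, a file that never appears in the listing at all) would not show up in the per-entry messages, so the final count line is there to help spot those.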


This message was sent by Atlassian JIRA