hadoop-yarn-dev mailing list archives

From "ledion bitincka (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-1440) Yarn aggregated logs should be stored in a simpler format
Date Fri, 22 Nov 2013 20:39:35 GMT
ledion bitincka created YARN-1440:
-------------------------------------

             Summary: Yarn aggregated logs should be stored in a simpler format
                 Key: YARN-1440
                 URL: https://issues.apache.org/jira/browse/YARN-1440
             Project: Hadoop YARN
          Issue Type: Improvement
            Reporter: ledion bitincka


The log aggregation feature in YARN is awesome! However, the file format in which the log
files are aggregated (TFile) should either be much simpler or be made pluggable.
The current TFile format forces anyone who wants to see the files to either
a) use the web UI,
b) use the CLI tools (yarn logs), or
c) write custom code to read the files.

My suggestion would be to simplify the log collection by collecting and writing the raw log
files into a directory structure as follows: 

/{log-collection-dir}/{app-id}/{container-id}/{log-file-name} 

This way the application developers can (re)use a much wider array of tools to process the
logs. 
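
To illustrate the point, here is a minimal Python sketch of what reading logs could look
like if they were stored as raw files under the proposed layout. Nothing here is an existing
YARN API: the helper names (log_path, iter_logs, grep_logs) and the LogFile type are
hypothetical, and the sketch assumes the log collection directory is reachable as an ordinary
local or mounted path.

```python
from pathlib import Path
from typing import Iterator, List, NamedTuple, Tuple


class LogFile(NamedTuple):
    """One raw log file in the proposed layout (hypothetical type)."""
    app_id: str
    container_id: str
    name: str
    path: Path


def log_path(root, app_id: str, container_id: str, log_file_name: str) -> Path:
    # Mirrors /{log-collection-dir}/{app-id}/{container-id}/{log-file-name}
    return Path(root) / app_id / container_id / log_file_name


def iter_logs(root) -> Iterator[LogFile]:
    """Walk the three directory levels and yield every log file found."""
    for app_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for container_dir in sorted(p for p in app_dir.iterdir() if p.is_dir()):
            for f in sorted(p for p in container_dir.iterdir() if p.is_file()):
                yield LogFile(app_dir.name, container_dir.name, f.name, f)


def grep_logs(root, needle: str) -> List[Tuple[LogFile, int, str]]:
    """Return (log file, 1-based line number, line) for each matching line."""
    hits = []
    for lf in iter_logs(root):
        # errors="replace" keeps the scan going if a log has odd bytes in it
        with open(lf.path, "r", errors="replace") as fh:
            for lineno, line in enumerate(fh, 1):
                if needle in line:
                    hits.append((lf, lineno, line.rstrip("\n")))
    return hits
```

With such a layout the same search could equally be done with standard tools such as grep or
find, since each log would be a plain file at a predictable path.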

For readers who are not familiar with these logs and their format, more info is available in
the following two blog posts:
http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/
http://blogs.splunk.com/2013/11/18/hadoop-2-0-rant/



--
This message was sent by Atlassian JIRA
(v6.1#6144)
