hadoop-common-issues mailing list archives

From "Mahadev konar (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6467) Performance improvement for liststatus on directories in hadoop archives.
Date Thu, 28 Jan 2010 22:21:34 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806131#action_12806131 ]

Mahadev konar commented on HADOOP-6467:
---------------------------------------

Doug,
most of the metadata used in archives is plain text. Avro would surely be suitable later, when we
have more advanced archives. For this JIRA, the goal is to make archives performant and usable
on a regular basis by users.

> Performance improvement for liststatus on directories in hadoop archives.
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-6467
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6467
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Mahadev konar
>            Assignee: Mahadev konar
>             Fix For: 0.22.0
>
>         Attachments: Archives_performance.docx
>
>
> A liststatus call on a directory in a Hadoop archive leads to (2 * number of files in the
> directory) open calls to the namenode. This is very suboptimal and needs to be fixed to make
> archives performant enough to be used on a daily basis.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

