hadoop-common-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10986) hadoop tarball is twice as big as prev. version and 6 times as big unpacked
Date Thu, 21 Aug 2014 23:02:11 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106105#comment-14106105
] 

Karthik Kambatla commented on HADOOP-10986:
-------------------------------------------

Thanks for the investigation, Alejandro. I see why this happened: the script (create-release.sh)
was running mvn install with the -Pdocs option, and I then copied the mvn site output on top of
that, so the tarball picked up generated docs from both. We can fix the script to skip javadoc
generation during install and use what we get from site instead. I tried this locally and the
binary tarball is much smaller.

I propose we also handle this under HADOOP-10956 and close this issue as a duplicate.
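
For anyone who wants to check an unpacked tarball for this kind of duplication, here is a rough sketch, not part of create-release.sh. It counts files whose content repeats an earlier file, which is what you would expect to see if the same docs were copied in twice. The function name count_duplicate_files is made up for illustration, and matching on cksum checksum plus size is only an approximation (CRC collisions are possible), so treat the number as a hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helper (not from the Hadoop build scripts): count files under a
# directory whose content duplicates a file seen earlier in the listing.
count_duplicate_files() {
    # cksum prints "<crc> <size> <name>" per file; CRC plus size is used as a
    # cheap content key. Only the first two fields are read, so file names
    # containing spaces do not affect the count.
    find "$1" -type f -exec cksum {} + |
        awk '{ key = $1 " " $2; if (seen[key]++) dups++ } END { print dups + 0 }'
}
```

Running count_duplicate_files hadoop-2.5.0/share against both the old and the new tarball should show whether the number of repeated files jumped between releases.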

> hadoop tarball is twice as big as prev. version and 6 times as big unpacked
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10986
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10986
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: André Kelpe
>            Assignee: Karthik Kambatla
>            Priority: Blocker
>
> I noticed that the binary tarball for 2.5.0 is almost 300MB, while 2.4.1 is only 132MB.
> Unpacking the latest tarball gives me 1.8 GB of stuff, with the majority in the "share" directory.
>  
> {code}
> $ cd hadoop-2.4.1
> $ du -sh *
> 364K    bin
> 356K    etc
> 100K    include
> 2,3M    lib
> 128K    libexec
> 24K     LICENSE.txt
> 12K     NOTICE.txt
> 12K     README.txt
> 336K    sbin
> 280M    share
> {code}
> {code}
>  $ cd hadoop-2.5.0 
>  $ du -sh *
> 512K    bin
> 332K    etc
> 100K    include
> 4,6M    lib
> 128K    libexec
> 336K    sbin
> 1,8G    share
> {code}
> I also saw some warnings from tar while unpacking:
> {code}
> $ tar xf hadoop-2.5.0.tar.gz 
> tar: Ignoring unknown extended header keyword `SCHILY.dev'
> tar: Ignoring unknown extended header keyword `SCHILY.ino'
> tar: Ignoring unknown extended header keyword `SCHILY.nlink'
> tar: Ignoring unknown extended header keyword `SCHILY.dev'
> tar: Ignoring unknown extended header keyword `SCHILY.ino'
> tar: Ignoring unknown extended header keyword `SCHILY.nlink'
> tar: Ignoring unknown extended header keyword `SCHILY.dev'
> tar: Ignoring unknown extended header keyword `SCHILY.ino'
> tar: Ignoring unknown extended header keyword `SCHILY.nlink'
> tar: Ignoring unknown extended header keyword `SCHILY.dev'
> tar: Ignoring unknown extended header keyword `SCHILY.ino'
> tar: Ignoring unknown extended header keyword `SCHILY.nlink'
> tar: Ignoring unknown extended header keyword `SCHILY.dev'
> tar: Ignoring unknown extended header keyword `SCHILY.ino'
> tar: Ignoring unknown extended header keyword `SCHILY.nlink'
> tar: Ignoring unknown extended header keyword `SCHILY.dev'
> tar: Ignoring unknown extended header keyword `SCHILY.ino'
> tar: Ignoring unknown extended header keyword `SCHILY.nlink'
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)