hadoop-common-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6255) Create an rpm integration project
Date Mon, 14 Feb 2011 22:41:02 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994537#comment-12994537 ]

Allen Wittenauer commented on HADOOP-6255:

Eric> After hadoop switching to maven, it will be more valuable to have rpm testing 
Eric> in the integration test phase.

Why would maven make a difference?

Eric> I guess you prefer to have /usr/local/etc/hadoop host the file, 
Eric> and /usr/local/hadoop/conf symlink to /usr/local/etc/hadoop.  Right?


Eric> rpm allows write of the location at installation time by using --relocate directive.
Eric> Debian does not support relocation, hence it needs to be controlled at compile time.

Most packaging systems, at least the ones I'm familiar with, don't allow RPM's level of relocation.
This is good and bad.  In our case, it sure makes it seem like we need to build hadoop-config.sh
at install time, at least in RPMs.
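As a rough illustration of what building hadoop-config.sh at install time could mean, here is a minimal Python sketch that renders the file from whatever prefix the package ends up installed under. The function name and the directory layout under the prefix are assumptions made up for this example and are not taken from the patch; only the HADOOP_HOME and HADOOP_CONF_DIR variable names come from Hadoop itself.

```python
from pathlib import PurePosixPath


def render_hadoop_config(prefix: str) -> str:
    """Return shell text for a hadoop-config.sh rooted at the given prefix."""
    root = PurePosixPath(prefix)
    # Assumed layout, for illustration only: <prefix>/hadoop and <prefix>/etc/hadoop.
    settings = {
        "HADOOP_HOME": root / "hadoop",
        "HADOOP_CONF_DIR": root / "etc" / "hadoop",
    }
    lines = ["# Generated at install time; do not edit."]
    lines += [f'export {name}="{path}"' for name, path in settings.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the file that would be written for a /usr/local install.
    print(render_hadoop_config("/usr/local"), end="")
```

A package post-install step could call something like this with the relocated prefix and write the result next to the other scripts, so the paths are never baked in at build time.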

Eric>  I will expose the build time parameters in the next patch.


Owen> Allen and Steve, I believe that the proposed layout follows the redhat & 
Owen> debian guidelines where all of the arch dependent files go in to $prefix/lib 
Owen> and the arch independent files go into $prefix/share.

ObDisclosure/Rant: I don't think FHS is 100% the right way to do things 100% of the time.
My particular beef is that I'm not a fan of relatively hefty applications that are typically
running on dedicated boxes (Fedora/389 Directory Server, I'm looking at you) strictly following
the FHS--and thus scattering files all over the file system. It almost always makes upgrading
in place HARD, from both a user and a developer perspective.  In the case of Hadoop, it always
made sense to me to have a single, consolidated directory because it hits my 'dedicated box'
criteria.  However, I'm trying to keep an open mind on this one... 

So, anyway, on to jars. I dug into this a bit more.  jar files in share is a mixed bag.  On
a pure technical level, jar files are architecture-independent and would therefore qualify
to go to share.  But by FHS rules, it looks like lib is just as valid: 

"/usr/lib includes object files, libraries, and internal binaries that are not intended to
be executed directly by users or shell scripts."

(if we keep in mind that java is reading the jar files, not the shell scripts)

Doing a quick pass through the various OSes I have lying about, I'm finding far more jar
files in lib than I am in share.  (I don't have Debian installed, but I do have several revs
of RPM-based Linuxes. Both RHEL and Debian push the FHS as The Spec.  Interestingly enough,
only OS X had no jar files in lib and all of them in share.  But I think we can all agree
that OS X falls into the 'weirdo' category in most cases...)
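For anyone who wants to repeat that quick pass on their own machines, a minimal Python sketch follows. It is not the method used for the survey above; the function name is made up for this example, and it only assumes that /usr/lib and /usr/share are the two trees worth comparing.

```python
import os


def count_jars(root: str) -> int:
    """Count files ending in .jar anywhere under root."""
    total = 0
    # os.walk skips unreadable or missing directories by default.
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(1 for name in filenames if name.endswith(".jar"))
    return total


if __name__ == "__main__":
    for root in ("/usr/lib", "/usr/share"):
        print(root, count_jars(root))
```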

FHS, BTW, also has this to say:

"It is recommended that application-specific, architecture-independent directories be placed
here. Such directories include groff, perl, ghostscript, texmf, and kbd (Linux) or syscons
(BSD). They may, however, be placed in /usr/lib for backwards compatibility, at the distributor's
discretion."

From here, two things:

1) So even though they give perl as an example, I know I have yet to work on an OS that was
built that way.  This is likely due to backward compatibility.

2) I think putting jars in share is not in line with the spirit of the text or past history.
/usr/share is meant to be content that could be NFS mounted from a common source (i.e., shared)
and not essential to a working system.  Reading through the FHS gives plenty of examples that
meet that intent: documentation, timezone files, dictionaries, and other misc files.

Owen> but for the cross-over, we'll need to support a directory that looks like a 
Owen> HADOOP_HOME with symlinks to the real files.

I'm going to play devil's advocate here and ask... Do we?  There are times when breaking backward
compatibility is a good thing.  I'd argue this is a great time to do it. I think we're young
enough to get away with it and given that this release will be majorly transitional anyway....
But if you guys are set on this, then what I've typically seen is to create a /usr/(appname-version)
dir and populate it, essentially transitioning from a pseudo-SysV layout to FHS.
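If the cross-over directory does get built, the HADOOP_HOME-with-symlinks idea from Owen's comment could look roughly like the Python sketch below. The function name and the link names in the usage note are hypothetical; nothing here reflects what the patch actually does.

```python
import os


def build_compat_home(home: str, links: dict[str, str]) -> None:
    """Create `home` and fill it with symlinks pointing at the real locations."""
    os.makedirs(home, exist_ok=True)
    for name, target in links.items():
        link_path = os.path.join(home, name)
        if os.path.islink(link_path):
            os.remove(link_path)  # replace a stale link from an earlier install
        os.symlink(target, link_path)
```

A call such as build_compat_home("/usr/hadoop-0.20", {"conf": "/etc/hadoop", "lib": "/usr/lib/hadoop"}) would then give old scripts a single directory to point at while the real files live in their FHS locations (both paths are examples only).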

> Create an rpm integration project
> ---------------------------------
>                 Key: HADOOP-6255
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6255
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.20.100
>            Reporter: Owen O'Malley
>            Assignee: Eric Yang
>             Fix For: 0.20.100
>         Attachments: HADOOP-6255-branch-0.20-security.patch, HADOOP-6255.patch, deployment.pdf
> We should be able to create RPMs for Hadoop releases.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

