hadoop-common-issues mailing list archives

From "Vivek Ganesan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9059) hadoop-daemons.sh script constraint that all the nodes should use the same installation path.
Date Thu, 06 Feb 2014 18:18:11 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893612#comment-13893612 ]

Vivek Ganesan commented on HADOOP-9059:

The idea I am considering implementing is as follows:

1. Introduce a new optional configuration file (referenced by a new configuration setting
that points to the file) which gives each slave node's Hadoop installation path in the format
<ip_address>;<hadoop_installation_path>, one entry per line.
2. Parse the new configuration file during master startup, then start Hadoop on each slave
over ssh using that slave's configured installation path.
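The two steps above could be sketched roughly as follows. This is only a minimal illustration of the idea, not the actual patch; the file name `slave-paths`, the function name, and the echoed (rather than executed) ssh command are all assumptions made for the example:

```shell
#!/usr/bin/env bash
# Sketch of the proposed per-slave installation paths (names are assumptions).
# Config file format, one entry per line: <ip_address>;<hadoop_installation_path>

# Print the ssh command that would start the daemon on each slave,
# using that slave's own installation path instead of the master's.
start_slaves() {
  local paths_file=$1; shift
  while IFS=';' read -r ip hadoop_path; do
    [ -z "$ip" ] && continue   # skip blank lines
    # A real implementation would exec this instead of echoing it:
    echo ssh "$ip" "cd '$hadoop_path' && bin/hadoop-daemon.sh --config '$hadoop_path/conf' $*"
  done < "$paths_file"
}

# Demo with a temporary config file
tmp=$(mktemp)
printf '10.0.0.1;/opt/hadoop\n10.0.0.2;/usr/local/hadoop\n' > "$tmp"
start_slaves "$tmp" start namenode
rm -f "$tmp"
```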

Any comments, improvements, or suggestions are most welcome.

Also, I am tagging this JIRA as an improvement, since this is not a bug. The necessity of having
the same HADOOP_HOME path on all nodes is documented in the Hadoop documentation (http://hadoop.apache.org/docs/r1.1.1/cluster_setup.html).

> hadoop-daemons.sh script constraint that all the nodes should use the same installation path.
> ---------------------------------------------------------------------------------------------
>                 Key: HADOOP-9059
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9059
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: bin
>    Affects Versions: 1.0.4
>         Environment: Linux
>            Reporter: Chunliang Lu
>            Assignee: Vivek Ganesan
>            Priority: Critical
>   Original Estimate: 25h
>  Remaining Estimate: 25h
> To run a command on all slave hosts, bin/hadoop-daemons.sh calls bin/slaves.sh on its
last line:
> {code}
> exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_HOME" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"
> {code}
> where slaves.sh calls ssh and passes the `cd "$HADOOP_HOME" \; "$bin/hadoop-daemon.sh"
--config $HADOOP_CONF_DIR "$@"` part to the slaves. In bash, $HADOOP_HOME, $bin, and $HADOOP_CONF_DIR
are expanded with the current settings on the master, which means that all the slave nodes
are constrained to share the same path settings as the master node. This is not reasonable.
In my setup, the cluster has a shared NFS, and I would like to use different configuration
files for different machines. I know this is not a recommended way to manage clusters, but
I just have no choice, and I think other people may face the same problem. How about replacing
it as follows, allowing different configurations for master and slaves?
> {code}
> exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd '$HADOOP_PREFIX' \; "bin/hadoop-daemon.sh"
> {code}
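The crux of the suggested replacement is the quoting: double quotes around $HADOOP_HOME make the master's shell expand it before ssh runs, while single quotes around $HADOOP_PREFIX pass the literal string through so the remote shell expands it with the slave's own value. A minimal illustration (the path is made up for the example):

```shell
#!/usr/bin/env bash
# Where a variable expands depends on the quoting around it.
HADOOP_PREFIX=/opt/hadoop-master

double="cd $HADOOP_PREFIX"   # expanded NOW, by this (the master's) shell
single='cd $HADOOP_PREFIX'   # kept literal; a remote shell would expand it later

echo "$double"   # -> cd /opt/hadoop-master
echo "$single"   # -> cd $HADOOP_PREFIX
```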

This message was sent by Atlassian JIRA
