hadoop-common-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6465) Write a Terremark cloud provider
Date Tue, 05 Jan 2010 19:13:54 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796806#action_12796806 ]

Aaron Kimball commented on HADOOP-6465:
---------------------------------------

Tom,

Can you talk a bit about how you tested this? I see some unit tests, but those don't actually
cover interaction with a Terremark service.

As for the patch:


readme-vcloud:

Regarding {{TERREMARK_KEY/SECRET}}, are these also specifiable via command-line options? I
feel like they should be. (Ditto with AWS credentials if they aren't already.)

They should also be specifiable in the configuration file.
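
A minimal sketch of what such a lookup could look like, assuming a precedence of command line, then configuration file, then environment. The option names ({{--terremark-key}}, {{--terremark-secret}}), the {{terremark}} config section, and the helper {{resolve_credential}} are hypothetical and not taken from the patch; the environment variable names assume that {{TERREMARK_KEY/SECRET}} expands to {{TERREMARK_KEY}} and {{TERREMARK_SECRET}}.

```python
import argparse
import configparser
import os


def resolve_credential(cli_value, config, section, option, env_var):
    """Return the first value found: command line, then config file, then environment."""
    if cli_value:
        return cli_value
    if config.has_option(section, option):
        return config.get(section, option)
    return os.environ.get(env_var)


# Hypothetical option names; an explicit argument list keeps the sketch self-contained.
parser = argparse.ArgumentParser()
parser.add_argument("--terremark-key")
parser.add_argument("--terremark-secret")
args = parser.parse_args(["--terremark-key", "from-cli"])

# Hypothetical config section and option, read from a string instead of a file.
config = configparser.ConfigParser()
config.read_string("[terremark]\nsecret = from-config\n")

key = resolve_credential(
    args.terremark_key, config, "terremark", "key", "TERREMARK_KEY")
secret = resolve_credential(
    args.terremark_secret, config, "terremark", "secret", "TERREMARK_SECRET")
```

Here the key comes from the command line and the secret from the configuration file; the environment variable is consulted only when neither supplies a value.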

{{ssh-keygen -f id_rsa_rackspace -P ''}} -- rackspace? Shouldn't this example say "terremark"?

{quote}
Note: you should use short cluster name identifiers, here "tm", (no more than
four characters), since they are used as a part of the instance name, which
is limited to 15 characters in Terremark.
{quote}

The program itself should warn you about this, if it doesn't already. (If this is going to
cause users problems, don't count on them having read the README.)
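
A rough sketch of such a check, assuming it runs before any instance is launched. The function name {{check_cluster_name}} and the constant name are hypothetical; the four-character limit is the one given in the README note quoted above.

```python
import warnings

# Recommended maximum from the README note: "no more than four characters".
MAX_CLUSTER_NAME_LENGTH = 4


def check_cluster_name(cluster_name):
    """Return True if the name is short enough; otherwise warn and return False."""
    if len(cluster_name) > MAX_CLUSTER_NAME_LENGTH:
        warnings.warn(
            "Cluster name %r is longer than %d characters; Terremark instance "
            "names are limited to 15 characters and include the cluster name."
            % (cluster_name, MAX_CLUSTER_NAME_LENGTH))
        return False
    return True
```

Whether the program should only warn or refuse to proceed is a design choice; a hard failure may be preferable if an over-long name is guaranteed to be rejected by the service.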


hadoop-terremark-init-remote.sh:

In update_repo(), you have {{sudo apt-get update}} and {{yum update -y yum}}. If one requires
the {{sudo}}, so should the other. Also, shouldn't they both take a {{-y}} argument?

Function {{install_java()}} -- this looks like it only works with {{dpkg}}. Is there a {{yum}}-based
equivalent? If one is not necessary, could you say so in a comment?

install_base_packages() is similarly Debian-specific; does it need a {{yum}} equivalent?

make_hadoop_dirs() allows multiple hadoop mount points. configure_hadoop is hard-coded to
{{/data}} -- shouldn't this be parameterized too?

"Hadoop logs should be on the /mnt partition" -- you mean {{/data}} ?

start_namenode() and start_daemon() both include redundant logic to determine the value of
{{$AS_HADOOP}} -- consider factoring it into a shared function.



vcloud.py:

You use re.match() calls to dissect node names of the form cluster-role-nodeId. Can
you use some constants so that instead of {{re.group(1)}} you have {{re.group(ROLE_PART)}}, CLUSTER_PART,
NODE_ID_PART, etc.?
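
Something along these lines, as a sketch only. The regular expression and the helper {{parse_node_name}} are assumptions, not the code in the patch, and the group numbers simply follow the cluster-role-nodeId order.

```python
import re

# Group indices for node names of the form cluster-role-nodeId.
CLUSTER_PART = 1
ROLE_PART = 2
NODE_ID_PART = 3

# Assumed pattern: three hyphen-separated parts, none containing a hyphen.
NODE_NAME_RE = re.compile(r"^([^-]+)-([^-]+)-([^-]+)$")


def parse_node_name(name):
    """Return (cluster, role, node_id), or None if the name does not match."""
    match = NODE_NAME_RE.match(name)
    if match is None:
        return None
    return (match.group(CLUSTER_PART),
            match.group(ROLE_PART),
            match.group(NODE_ID_PART))
```

Named groups such as {{(?P<role>...)}} would achieve the same readability without separate constants.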




> Write a Terremark cloud provider
> --------------------------------
>
>                 Key: HADOOP-6465
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6465
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: contrib/cloud
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: HADOOP-6465.patch
>
>
> The scripts in contrib/cloud currently only support running on EC2. This issue is to
> add support for running Hadoop clusters on Terremark's vCloud Express platform.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

