whirr-user mailing list archives

From John Conwell <j...@iamjohn.me>
Subject Re: AMIs to use when creating hadoop cluster with whirr
Date Wed, 05 Oct 2011 16:52:21 GMT
It starts with hadoop reporting blocks of data being 'lost', then individual
data nodes stop responding, the individual data nodes get taken off line,
then jobs get killed, then data nodes come back on line and the data blocks
get replicated back out to the correct replication factor.

The end result is that about 80% of the time my hadoop jobs get killed because
some task fails 3 times in a row, but about an hour after the job gets
killed, all data nodes are back online and all data is fully replicated.

Before I go rat holing down "why are my data nodes going down", I want to
cover the easy scenarios like "oh yeah... you're totally misconfigured.  You
should use ABC ami with the cloudera install and config scripts".  Basically,
I want to validate whether there are any best practices for setting up a
cloudera distribution of hadoop on EC2.

I know cloudera has created their own AMIs.  Should I be using them?  Does
it matter?
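
For reference, the setup described in my original message (quoted below)
corresponds roughly to a properties file like the following. This is only a
sketch: the two function properties and the AMI id are the ones I mentioned,
but the cluster name, instance templates, and the us-east-1 region prefix are
placeholder assumptions, not my exact values.

```properties
# Hypothetical cluster name and node layout (assumptions for illustration)
whirr.cluster-name=hadoop-cdh-test
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,3 hadoop-datanode+hadoop-tasktracker

whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}

# Plain Ubuntu amd64 AMI from the original message; region prefix is assumed
whirr.image-id=us-east-1/ami-da0cf8b3

# Use the CDH install and configure scripts instead of the Apache defaults
whirr.hadoop-install-function=install_cdh_hadoop
whirr.hadoop-configure-function=configure_cdh_hadoop
```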



On Wed, Oct 5, 2011 at 9:43 AM, Andrei Savu <savu.andrei@gmail.com> wrote:

> What do you mean by failing? Is the Hadoop daemon shutting down or the
> machine as a whole?
>
> On Wednesday, October 5, 2011, John Conwell wrote:
>
>> I'm having stability issues (data nodes constantly failing under very
>> little load) on the hadoop clusters I'm creating, and I'm trying to figure
>> out the best practice for creating the most stable hadoop environment on
>> EC2.
>>
>> In order to run the cdh install and config scripts, I'm
>> setting whirr.hadoop-install-function to install_cdh_hadoop, and
>> whirr.hadoop-configure-function to configure_cdh_hadoop.  But I'm using a
>> plain jane ubuntu amd64 ami (ami-da0cf8b3).  Should I also be using the
>> cloudera AMIs as well as the cloudera install and config scripts?
>>
>> Are there any best practices for how to set up a cloudera distribution of
>> hadoop on EC2?
>>
>> --
>>
>> Thanks,
>> John C
>>
>>
>
> --
> -- Andrei Savu / andreisavu.ro
>
>


-- 

Thanks,
John C
