whirr-user mailing list archives

From Andrei Savu <savu.and...@gmail.com>
Subject Re: AMIs to use when creating hadoop cluster with whirr
Date Wed, 05 Oct 2011 17:15:14 GMT
This looks like a network congestion issue to me. I don't know how to do it,
but I would try increasing the heartbeat timeout.

Tom, any ideas? Have you seen this before on AWS?

I don't think there is anything wrong with the AMI; I suspect there is
something wrong with the Hadoop configuration.
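
For reference, here is a rough, untested sketch of a Whirr properties file
matching the setup John describes further down. Only the AMI ID and the two
whirr.hadoop-*-function values come from this thread. The cluster name, node
counts, region, and credential lines are placeholders. The commented-out
heartbeat line is an assumption on two counts: that this Whirr version passes
"hadoop-hdfs."-prefixed properties through to hdfs-site.xml, and that
heartbeat.recheck.interval (in milliseconds) is the right knob for this
Hadoop release. Check both before relying on it.

```properties
# Placeholders (not from the thread): cluster name, node counts, region.
whirr.cluster-name=hadoop-cdh-test
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,4 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.location-id=us-east-1

# From the thread: stock Ubuntu amd64 AMI plus the CDH install/config functions.
# The "us-east-1/" region prefix is an assumption.
whirr.image-id=us-east-1/ami-da0cf8b3
whirr.hadoop-install-function=install_cdh_hadoop
whirr.hadoop-configure-function=configure_cdh_hadoop

# Assumption, unverified: override an HDFS property to give data nodes longer
# before the name node marks them dead. The value is illustrative only.
# hadoop-hdfs.heartbeat.recheck.interval=600000
```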

On Wednesday, October 5, 2011, John Conwell wrote:

> It starts with Hadoop reporting blocks of data being 'lost', then individual
> data nodes stop responding, the individual data nodes get taken offline,
> then jobs get killed, then data nodes come back online and the data blocks
> get replicated back out to the correct replication factor.
>
> The end result is that about 80% of the time, my Hadoop jobs get killed
> because some task fails 3 times in a row, but about an hour after the job
> gets killed, all data nodes are back online and all data is fully replicated.
>
> Before I go rat-holing down "why are my data nodes going down", I want to
> cover the easy scenarios like "oh yeah... you're totally misconfigured.  You
> should use ABC AMI with the Cloudera install and config scripts".  Basically,
> I want to validate whether there are any best practices for setting up a
> Cloudera distribution of Hadoop on EC2.
>
> I know Cloudera has created their own AMIs.  Should I be using them?  Does
> it matter?
>
>
>
> On Wed, Oct 5, 2011 at 9:43 AM, Andrei Savu <savu.andrei@gmail.com> wrote:
>
>> What do you mean by failing? Is the Hadoop daemon shutting down or the
>> machine as a whole?
>>
>> On Wednesday, October 5, 2011, John Conwell wrote:
>>
>>> I'm having stability issues (data nodes constantly failing under very
>>> little load) on the Hadoop clusters I'm creating, and I'm trying to figure
>>> out the best practice for creating the most stable Hadoop environment on
>>> EC2.
>>>
>>> In order to run the CDH install and config scripts, I'm
>>> setting whirr.hadoop-install-function to install_cdh_hadoop, and
>>> whirr.hadoop-configure-function to configure_cdh_hadoop.  But I'm using a
>>> plain-jane Ubuntu amd64 AMI (ami-da0cf8b3).  Should I be using the
>>> Cloudera AMIs as well as the Cloudera install and config scripts?
>>>
>>> Are there any best practices for how to set up a Cloudera distribution of
>>> Hadoop on EC2?
>>>
>>> --
>>>
>>> Thanks,
>>> John C
>>>
>>>
>>
>> --
>> -- Andrei Savu / andreisavu.ro
>>
>>
>
>
> --
>
> Thanks,
> John C
>
>

-- 
-- Andrei Savu / andreisavu.ro
