hbase-user mailing list archives

From Joost Ouwerkerk <joo...@openplaces.com>
Subject Re: HBase/Hadoop backups in EC2
Date Fri, 21 Mar 2008 03:43:26 GMT
We're actually hoping to use HBase in an online production capacity,
so I don't think we could shut down often enough to protect the
data.  We could probably stand to lose about 15 minutes' worth of data,
but couldn't shut down more than once a day.  Ideally we'd have some
kind of incremental replication mechanism.  I imagine that this kind
of question is among the least of your concerns at this stage in
HBase's evolution, though.

On 20-Mar-08, at 11:00 PM, stack wrote:

> Can you afford to shut down HBase?  Shutting down HBase will force it
> to flush what's in memory out to the filesystem.  Once it's down, you
> could run a distcp between your EC2 HDFS and S3.  Would that work for you?
> Otherwise, there is currently no mechanism for taking a snapshot of
> HBase (HBASE-50 tracks the issue).
> St.Ack
> Joost Ouwerkerk wrote:
>> Does anyone have any experience with backing up data for an HBase  
>> cluster on Amazon EC2 instances?  We could take snapshots of the  
>> filesystem on a regular basis and dump to S3, but I was wondering  
>> if anyone had any other strategies to recommend?  EC2 instances can  
>> theoretically drop at any time (although this hasn't happened to us
>> yet), and although HDFS is distributed, there's no guarantee that  
>> all our EC2 instances aren't located on the same Amazon node when  
>> it goes down.
>> ---
>> Joost
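
The shutdown-and-copy approach stack describes above can be sketched roughly
as follows. This is only a sketch under stated assumptions, not a tested
procedure: the install path, the HDFS URI, and the bucket name are all
hypothetical, and the `s3a://` scheme is an assumption based on current Hadoop
releases rather than anything mentioned in the thread. It stops HBase so the
in-memory data is written out, copies the HBase directory with `hadoop distcp`,
and restarts HBase even if the copy fails.

```python
import subprocess
from datetime import datetime, timezone

# Hypothetical values for illustration only; none come from the thread.
HBASE_BIN = "/opt/hbase/bin"              # assumed HBase install location
SRC = "hdfs://namenode:9000/hbase"        # assumed HBase root dir in HDFS
DEST_ROOT = "s3a://example-backup-bucket/hbase"  # assumed S3 bucket and scheme


def backup_commands(timestamp, hbase_bin=HBASE_BIN, src=SRC, dest_root=DEST_ROOT):
    """Return the stop, copy, and start commands for one cold backup."""
    dest = f"{dest_root}/{timestamp}"
    return [
        [f"{hbase_bin}/stop-hbase.sh"],    # shutdown flushes in-memory data
        ["hadoop", "distcp", src, dest],   # bulk copy from HDFS to S3
        [f"{hbase_bin}/start-hbase.sh"],   # bring HBase back online
    ]


def run_backup(dry_run=True):
    """Run one backup cycle, or just print the commands when dry_run is True."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    stop, copy, start = backup_commands(stamp)
    if dry_run:
        for cmd in (stop, copy, start):
            print(" ".join(cmd))
        return
    subprocess.run(stop, check=True)
    try:
        subprocess.run(copy, check=True)
    finally:
        # Restart HBase whether or not the copy succeeded.
        subprocess.run(start, check=True)


if __name__ == "__main__":
    run_backup(dry_run=True)
```

Because HBase is offline for the whole copy, this only fits the
once-a-day window mentioned above; it does not address the 15-minute
data-loss target, which would still need some incremental mechanism.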
