hbase-user mailing list archives

From "Amlan Roy" <amlan....@cleartrip.com>
Subject RE: Hbase bkup options
Date Mon, 23 Jul 2012 15:33:07 GMT
Hi Michael,

Thanks a lot for the reply. What I want to achieve is this: if my cluster goes
down for some reason, I should be able to create a new cluster and import all
the backed-up data. As I want to store all the tables, I expect the data size
to be huge (on the order of terabytes), and it will keep growing.

If I have understood correctly, you have suggested running "export" to get
the data into HDFS and then running "hadoop fs -copyToLocal" to copy it to
the local file system. If I take a backup of those files, is it possible to
import that data into a new HBase cluster?
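
For illustration, the restore half of that round trip might look like the
sketch below. It assumes the files came from the stock Export job, so the
matching Import job can read them back once they are on the new cluster's
HDFS. The table name, the paths, and the DRY_RUN switch are hypothetical
placeholders, not anything from this thread. Import does not create the table,
so it must already exist on the new cluster with the same column families.

```shell
#!/bin/sh
# Sketch only: restore an exported HBase table from local disk into a new cluster.
# TABLE, LOCAL_DIR and IMPORT_DIR are hypothetical placeholders.
TABLE="${TABLE:-mytable}"
LOCAL_DIR="${LOCAL_DIR:-/mnt/backup/mytable}"   # local copy of the Export output
IMPORT_DIR="${IMPORT_DIR:-/restore/mytable}"    # staging directory on the new HDFS
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

restore_from_local() {
  # 1. Push the backed-up files from local disk into the new cluster's HDFS.
  # 2. Run the Import M/R job; the target table must already exist.
  run hadoop fs -copyFromLocal "$LOCAL_DIR" "$IMPORT_DIR" &&
  run hbase org.apache.hadoop.hbase.mapreduce.Import "$TABLE" "$IMPORT_DIR"
}
```

With DRY_RUN left at 1, calling restore_from_local only prints the two
commands, which makes it easy to check the paths before running anything for
real with DRY_RUN=0.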

Thanks and regards,

-----Original Message-----
From: Michael Segel [mailto:michael_segel@hotmail.com] 
Sent: Monday, July 23, 2012 8:19 PM
To: user@hbase.apache.org
Subject: Re: Hbase bkup options


As always, the answer to your question is... it depends.

First, how much data are we talking about? 

What's the value of the underlying data? 

One possible scenario...
You run an M/R job to copy data from the table to an HDFS file, which is then
copied to attached storage on an edge node and then to tape.
Depending on how much data you have and how much disk is in the attached
storage, you may want to keep a warm copy there, a 'warmer/hot' copy on HDFS,
and a cold copy on tape at some offsite storage facility.

There are other options, but it all depends on what you want to achieve. 

With respect to the other tools...

You can export (which is an M/R job) to a local directory, then use distcp
to copy to a different cluster.  hadoop fs -copyToLocal will let you copy off the
You could write your own code, but you don't get much gain over existing
UNIX/Linux tools. 
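
As a rough sketch of the two routes above, assuming the stock Export job
writes to a directory on the cluster's HDFS: the table name, the paths, the
remote HDFS URI, and the DRY_RUN switch are hypothetical placeholders for
illustration only.

```shell
#!/bin/sh
# Sketch only: back up one HBase table with the existing tools.
# TABLE, EXPORT_DIR, LOCAL_DIR and REMOTE_HDFS are hypothetical placeholders.
TABLE="${TABLE:-mytable}"
EXPORT_DIR="${EXPORT_DIR:-/backup/mytable}"     # HDFS directory for Export output
LOCAL_DIR="${LOCAL_DIR:-/mnt/backup}"           # attached storage on an edge node
REMOTE_HDFS="${REMOTE_HDFS:-hdfs://backup-nn:8020/backup/mytable}"
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Route 1: export the table, then copy the files off HDFS to local disk.
backup_to_local() {
  run hbase org.apache.hadoop.hbase.mapreduce.Export "$TABLE" "$EXPORT_DIR" &&
  run hadoop fs -copyToLocal "$EXPORT_DIR" "$LOCAL_DIR"
}

# Route 2: export the table, then distcp the files to a different cluster.
backup_to_cluster() {
  run hbase org.apache.hadoop.hbase.mapreduce.Export "$TABLE" "$EXPORT_DIR" &&
  run hadoop distcp "$EXPORT_DIR" "$REMOTE_HDFS"
}
```

From the local directory, ordinary UNIX/Linux tools can take the files on to
tape or offsite storage.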

On Jul 23, 2012, at 7:52 AM, Amlan Roy wrote:

> Hi,
> Is it feasible to do disk or tape backup for HBase tables?
> I have read about tools like Export, CopyTable, and DistCp. It seems like
> they will require a separate HDFS cluster to do that.
> Regards,
> Amlan
