hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Copying data from one Hbase cluster to Another Hbase cluster
Date Sat, 15 Feb 2014 13:13:34 GMT
Hi Vimal,

While your MR job is running, if you don't use HBase at the same time, you
might be able to reduce HBase's memory allocation. Also, I don't think you
need to give more than 1GB to the MR framework if you only do a distcp...

JM


2014-02-15 Vimal Jain <vkjk89@gmail.com>:

> Hi all,
> I don't have anything against MR; the only problem is that my machine is
> low-end (it has only 2 cores and 8 GB RAM).
> That's why I am insisting on an MR-free approach to achieve the goal.
>
>
> On Sat, Feb 15, 2014 at 3:50 PM, Harsh J <harsh@cloudera.com> wrote:
>
> > Note that a long-running MR service is not a requirement, and that MR
> > can be used just as a speedy facilitator. Nothing's gonna go wrong if
> > you shut down your MR services right after your parallel copy (via
> > distcp/etc.) has completed.
> >
> > On Sat, Feb 15, 2014 at 9:39 AM, divye sheth <divs.sheth@gmail.com> wrote:
> > > You could try the hadoop distcp command to transfer the hbase directory
> > > from one cluster to the other. This does not require you to set up
> > > MapReduce; it will start a mapred job in local mode, i.e. with a single
> > > mapper. When copying from one cluster to another, remember not to copy
> > > -ROOT- and .META.
> > > I have used this method without facing any data loss. After the copy is
> > > complete, start your new HBase; it should be able to read the contents
> > > and build region information from the new directory.
> > >
> > > Thanks
> > > D
> > > On Feb 14, 2014 5:45 PM, "Samir Ahmic" <ahmic.samir@gmail.com> wrote:
> > >
> > >> Well, that depends on the size of your dataset. You can use
> > >> hadoop fs -copyToLocal to copy the /hbase directory to local disk or to
> > >> some other storage device that is mounted on your original cluster. Then
> > >> you can copy the /hbase dir to the second cluster with
> > >> hadoop fs -copyFromLocal. Of course, this will require that the source
> > >> and destination HBase clusters are offline. I have never used this
> > >> approach, but it should work.
> > >>
> > >> Regards
> > >>
> > >>
> > >>
> > >>
> > >> On Fri, Feb 14, 2014 at 11:15 AM, Vimal Jain <vkjk89@gmail.com> wrote:
> > >>
> > >> > Hi Samir,
> > >> > As far as I know, all these techniques require the MapReduce daemons
> > >> > to be up on the source and destination clusters.
> > >> > Is there any other solution which does not require MapReduce at all?
> > >> >
> > >> >
> > >> > On Fri, Feb 14, 2014 at 2:41 PM, Samir Ahmic <ahmic.samir@gmail.com> wrote:
> > >> >
> > >> > > Hi Vimal,
> > >> > >
> > >> > > Here are a few options for moving data from one HBase cluster to
> > >> > > another:
> > >> > >
> > >> > >    1. You can use the org.apache.hadoop.hbase.mapreduce.Export tool
> > >> > >    to export tables to HDFS, and then use hadoop distcp to move the
> > >> > >    data to the other cluster. Once the data is in place on the second
> > >> > >    cluster, you can use the org.apache.hadoop.hbase.mapreduce.Import
> > >> > >    tool to import the tables. Please look at
> > >> > >    http://hbase.apache.org/book.html#export.
> > >> > >    2. The second option is to use the CopyTable tool; please look at:
> > >> > >    http://hbase.apache.org/book.html#copytable
> > >> > >    3. The third option is to enable HBase snapshots, create table
> > >> > >    snapshots, and then use the ExportSnapshot tool to move them to
> > >> > >    the second cluster. Once the snapshots are on the second cluster,
> > >> > >    you can clone tables from the snapshots. Please look at:
> > >> > >    http://hbase.apache.org/book.html#ops.snapshots
> > >> > >
> > >> > > I have used options 1 and 3 for moving data between clusters, and in
> > >> > > my case option 3 was the better solution.
> > >> > >
> > >> > > Regards
> > >> > > Samir
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Fri, Feb 14, 2014 at 8:33 AM, Vimal Jain <vkjk89@gmail.com> wrote:
> > >> > >
> > >> > > > Hi,
> > >> > > > I have HBase and Hadoop set up in pseudo-distributed mode in
> > >> > > > production.
> > >> > > > Now I am planning to move from pseudo-distributed mode to fully
> > >> > > > distributed mode (a 2-node cluster).
> > >> > > > My existing Hadoop and HBase versions are 1.1.2 and 0.94.7.
> > >> > > > I am planning to run fully distributed mode with HBase version
> > >> > > > 0.94.16 and Hadoop version 1.X or 2.X (not yet decided).
> > >> > > >
> > >> > > > What are the different ways to copy data from the existing setup
> > >> > > > (pseudo-distributed mode) to this new setup (2-node fully
> > >> > > > distributed mode)?
> > >> > > >
> > >> > > > Please help.
> > >> > > >
> > >> > > > --
> > >> > > > Thanks and Regards,
> > >> > > > Vimal Jain
> > >> > > >
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > Thanks and Regards,
> > >> > Vimal Jain
> > >> >
> > >>
> >
> >
> >
> > --
> > Harsh J
> >
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>
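The distcp route suggested in the thread (copy the table directories under /hbase, skip -ROOT- and .META.) could be scripted roughly as below. This is only a sketch that builds the command lines without running them. The NameNode URIs, port 8020, and the table name are hypothetical placeholders, and it assumes the destination table directories do not exist yet. distcp's handling of existing destinations differs between Hadoop versions, so check the commands against your own release before running them.

```python
# Sketch only: builds `hadoop distcp` command lines, does not execute them.
# -ROOT- and .META. are the catalog directories the thread says not to copy.
CATALOG_DIRS = {"-ROOT-", ".META."}


def build_distcp_commands(src_nn, dst_nn, hbase_entries, max_maps=1):
    """Return one argv list per entry under /hbase, skipping catalog dirs.

    src_nn and dst_nn are hypothetical NameNode URIs; hbase_entries are the
    names found directly under /hbase on the source cluster. `-m` caps the
    number of simultaneous map tasks, which suits a low-end machine.
    """
    commands = []
    for name in hbase_entries:
        if name in CATALOG_DIRS:
            continue  # the new cluster rebuilds its own catalog
        commands.append([
            "hadoop", "distcp", "-m", str(max_maps),
            f"{src_nn}/hbase/{name}",
            f"{dst_nn}/hbase/{name}",
        ])
    return commands


commands = build_distcp_commands(
    "hdfs://old-nn:8020",          # hypothetical source NameNode
    "hdfs://new-nn:8020",          # hypothetical destination NameNode
    ["-ROOT-", ".META.", "mytable"],
)
for argv in commands:
    print(" ".join(argv))
```

Each list can be passed to subprocess.run once the paths have been verified, with HBase stopped on both clusters during the copy.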

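For the snapshot option mentioned in the thread (take a snapshot, export it with ExportSnapshot, clone it on the second cluster), the three steps might look like the sketch below. It only assembles the HBase shell statements and the export command line. The table name, snapshot name, and destination URI are hypothetical, and the mapper count of 1 is an assumption chosen for a small machine.

```python
# Sketch only: assembles the three steps of a snapshot-based migration.
# Nothing is executed; names and URIs below are illustrative placeholders.

def snapshot_migration_steps(table, snapshot, dst_hbase_root, mappers=1):
    """Return (source shell statement, export argv, destination shell statement)."""
    # Step 1: run in the HBase shell on the source cluster.
    take = f"snapshot '{table}', '{snapshot}'"
    # Step 2: run from the command line on the source cluster.
    export = [
        "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,
        "-copy-to", dst_hbase_root,
        "-mappers", str(mappers),
    ]
    # Step 3: run in the HBase shell on the destination cluster.
    clone = f"clone_snapshot '{snapshot}', '{table}'"
    return take, export, clone


take, export, clone = snapshot_migration_steps(
    "mytable",                      # hypothetical table name
    "mytable_snap",                 # hypothetical snapshot name
    "hdfs://new-nn:8020/hbase",     # hypothetical destination HBase root
)
print(take)
print(" ".join(export))
print(clone)
```

Snapshots have to be enabled on both clusters before these steps will work, as the thread notes.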