Hi guys,
Can I load HFiles into pre-split regions when copying a table from one
cluster to a new cluster?
Here is what I do now:
1. distcp the HFile folder from the source cluster to the destination
cluster.
2. grep the region names from the HBase table folder and use them to
pre-split the table in the destination cluster; I expect this to avoid
HFile splitting during the load.
3. Use hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles to
load the region files into the table.
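Steps 1 and 3 above can be sketched as a dry run that only prints the
commands instead of running them. The NameNode addresses and the staging
path are hypothetical placeholders, and passing one region directory at a
time to LoadIncrementalHFiles is an assumption about how the copied
layout is loaded, not something confirmed here.

```shell
#!/bin/sh
# Hypothetical locations; replace with the real cluster addresses and paths.
SRC=hdfs://source-nn:8020/hbase/data/itemtest/Table1
DST=hdfs://dest-nn:8020/staging/itemtest/Table1
TABLE=itemtest:Table1

# Step 1: print the distcp command that copies the table folder.
distcp_cmd() {
  echo "hadoop distcp $SRC $DST"
}

# Step 3: read region directory names on stdin and print one
# LoadIncrementalHFiles command per region directory (assumed layout:
# each region directory holds one subdirectory per column family).
load_cmds() {
  while read -r region; do
    echo "hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles $DST/$region $TABLE"
  done
}
```

Calling distcp_cmd, or piping a list of region directory names into
load_cmds, prints the commands so they can be reviewed before running
them by hand.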
For example,
I have a table's HFiles on HDFS from the source cluster, like this:
drwxr-xr-x - hbase hbase 0 2015-06-25 22:32 /hbase/data/itemtest/Table1/.tabledesc
drwxr-xr-x - hbase hbase 0 2015-06-25 22:32 /hbase/data/itemtest/Table1/.tmp
drwxr-xr-x - hbase hbase 0 2015-06-25 22:32 /hbase/data/itemtest/Table1/11111111111111111111111111111111
drwxr-xr-x - hbase hbase 0 2015-06-25 22:32 /hbase/data/itemtest/Table1/22222222222222222222222222222222
drwxr-xr-x - hbase hbase 0 2015-06-25 22:32 /hbase/data/itemtest/Table1/33333333333333333333333333333333
drwxr-xr-x - hbase hbase 0 2015-06-25 22:32 /hbase/data/itemtest/Table1/44444444444444444444444444444444
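The "grep the regions" step for a listing like the one above can be
sketched as follows. The function names extract_regions and
to_splits_literal are made up for illustration. The sketch only collects
the 32-character directory names and formats them as an array; it does
not check whether those names are valid split keys for the table.

```shell
#!/bin/sh
# Read 'hadoop fs -ls' output on stdin and print the 32-hex-character
# region directory names, one per line. Entries such as .tabledesc and
# .tmp do not match and are skipped.
extract_regions() {
  grep -oE '[0-9a-f]{32}$'
}

# Join the names read on stdin into an array literal of the form
# ['name1','name2',...] for use in the hbase shell.
to_splits_literal() {
  awk -v q="'" '
    BEGIN { printf "[" }
    { printf "%s%s%s%s", (NR > 1 ? "," : ""), q, $0, q }
    END { print "]" }'
}
```

Usage would be along the lines of:
hadoop fs -ls /hbase/data/itemtest/Table1 | extract_regions | to_splits_literal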
Then I want to create a pre-split table with these regions:
['11111111111111111111111111111111','22222222222222222222222222222222','33333333333333333333333333333333','44444444444444444444444444444444']
So in the hbase shell, I do this:
hbase(main):001:0> create 'itemtest:Table1', {NAME => 'BaseInfo',
DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER =>'NONE', REPLICATION_SCOPE =>
'0', COMPRESSION => 'SNAPPY', VERSIONS => '1', TTL => '2147483647',
MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536',
IN_MEMORY => 'false'}, {MAX_FILESIZE => 10737418240}, {SPLITS =>
['11111111111111111111111111111111','22222222222222222222222222222222','33333333333333333333333333333333','44444444444444444444444444444444']}
0 row(s) in 0.8080 seconds
But when I list the HDFS path, I find that the region folder names are
not the same as in the source cluster:
[tw79@e3ecmrhdp03 ~]$ hadoop fs -ls /hbase/data/itemtest/Table1
Found 7 items
drwxr-xr-x - hbase hbase 0 2015-06-26 09:41 /hbase/data/itemtest/Table1/.tabledesc
drwxr-xr-x - hbase hbase 0 2015-06-26 09:41 /hbase/data/itemtest/Table1/.tmp
drwxr-xr-x - hbase hbase 0 2015-06-26 09:41 /hbase/data/itemtest/Table1/09f1d9847762b45c5f095bb9b5dad986
drwxr-xr-x - hbase hbase 0 2015-06-26 09:41 /hbase/data/itemtest/Table1/0df1dbcc531b451504238c21ec1c06b9
drwxr-xr-x - hbase hbase 0 2015-06-26 09:41 /hbase/data/itemtest/Table1/285be1d39434ab6599f3966537229db5
drwxr-xr-x - hbase hbase 0 2015-06-26 09:41 /hbase/data/itemtest/Table1/2b8944234f41ad29a5722f87ac954d2c
drwxr-xr-x - hbase hbase 0 2015-06-26 09:41 /hbase/data/itemtest/Table1/e337e82c720a8559fdabd611e1ee21b3
Any suggestions on this?
Thanks.
tonywutao