hbase-user mailing list archives

From Dave Latham <lat...@davelink.net>
Subject Re: Issues with import from 0.92 into 0.98
Date Wed, 27 May 2015 15:35:30 GMT
Sounds like quite a puzzle.

You mentioned that you can read data written through manual Puts from
the shell - but not data from the Import.  There must be something
different about the data itself once it's in the table.  Can you
compare a row that was imported to a row that was manually written -
or show them to us?
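For example, a raw scan would surface differences that a normal scan hides (exact timestamps, cell types, delete markers). Something like this in the shell, assuming 'content' is the table and the two row keys below are placeholders for one imported row and one manually written row:

    hbase> scan 'content', {STARTROW => '<imported-row-key>', LIMIT => 1, RAW => true, VERSIONS => 10}
    hbase> scan 'content', {STARTROW => '<manual-row-key>', LIMIT => 1, RAW => true, VERSIONS => 10}

If the qualifier bytes or families differ between the two (even by whitespace or encoding), that would explain why the column filter matches one but not the other.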

On Wed, May 27, 2015 at 7:09 AM,  <apache@borkbork.net> wrote:
> So more experimentation over the long weekend on this.
> If I load sample data into the new cluster table manually through the
> shell, column filters work as expected.
> Obviously not a solution to the problem. Anyone have any ideas or things
> I should be looking at? The regionserver logs show nothing unusual.
> Is there another export/import chain I could try?
> Thanks,
> Zack
> On Sun, May 24, 2015, at 11:43 AM, apache@borkbork.net wrote:
>> Hello all-
>> I'm hoping someone can point me in the right direction as I've exhausted
>> all my knowledge and abilities on the topic...
>> I've inherited an old, poorly configured and brittle CDH4 cluster
>> running HBase 0.92. I'm attempting to migrate the data to a new Ambari
>> cluster running HBase 0.98. I'm attempting to do this without changing
>> anything on the old cluster as I have a hard enough time keeping it
>> running as is. Also, due to configuration issues with the old cluster
>> (on AWS), a direct HBase to HBase table copy, or even HDFS to HDFS copy
>> is out of the question at the moment.
>> I was able to use the export task on the old cluster to dump the HBase
>> tables to HDFS, which I then distcp s3n copied up to S3, then back down
>> to the new cluster, then used the HBase importer. This appears to work
>> fine...
>> ... except that on the new cluster table scans with column filters do
>> not work.
>> A sample row looks something like this:
>> A:9223370612274019807:twtr:56935907581904486 column=x:twitter:username,
>> timestamp=1424592575087, value=Bilo Selhi
>> Unfortunately, even though I can see the column is properly defined, I
>> cannot filter on it:
>> hbase(main):015:0> scan 'content' , {LIMIT=>10,
>> COLUMNS=>'x:twitter:username'}
>> ROW                           COLUMN+CELL
>> 0 row(s) in 352.7990 seconds
>> Any ideas what the heck is going on here?
>> Here's the rough process I used for the export/import:
>> Old cluster:
>> $ hbase org.apache.hadoop.hbase.mapreduce.Driver export content
>> hdfs:///hbase_content
>> $ hadoop distcp -Dfs.s3n.awsAccessKeyId='xxxx'
>> -Dfs.s3n.awsSecretAccessKey='xxxx' -i hdfs:///hbase_content
>> s3n://hbase_content
>> New cluster:
>> $ hadoop distcp -Dfs.s3n.awsAccessKeyId='xxxx'
>> -Dfs.s3n.awsSecretAccessKey='xxxx' -i s3n://hbase_content
>> hdfs:///hbase_content
>> $ hbase -Dhbase.import.version=0.94
>> org.apache.hadoop.hbase.mapreduce.Driver import content
>> hdfs:///hbase_content
>> Thanks!
>> Z
