hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Possible solution to 'WrongRegionException and inconsistent table found'
Date Wed, 06 Jul 2011 19:21:09 GMT
On Wed, Jul 6, 2011 at 5:37 AM, Xu-Feng Mao <m9suns@gmail.com> wrote:
> It looks like we've lost a region, including the directory on hdfs and its
> meta record. We need some more time to dig through the sea of logs to figure
> out the root cause.

You think it was https://issues.apache.org/jira/browse/HBASE-3872?

> But first of all, we need to recover the meta, so that we can put keys in
> that region. My understanding is that check_meta.rb and add_table.rb can
> fix some meta issues as long as the directory on hdfs and its .regioninfo
> still exist.

Yes.  add_table.rb will go out on the fs, find the regions for the table,
and rewrite that portion of .META.  In 0.90 it will not assign them,
though; you will likely need to disable and then re-enable the table to
get the regions out on the cluster.

check_meta.rb is likely the same.  It looks for the hole and, if you pass
-fix, will create a new region to plug the hole.  This is probably
what you need (You may need to assign the region post running the
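
To make "looking for the hole" concrete, here is a minimal Ruby sketch of
the idea.  It is not the actual check_meta.rb code: the Region struct and
find_holes helper are hypothetical names, and the region list is
hard-coded rather than scanned from .META.  It assumes end keys are
exclusive and that an empty key means "open-ended", as for the first and
last regions of a table.

```ruby
# Hypothetical stand-in for a .META. row: only the key range is kept.
Region = Struct.new(:start_key, :end_key)

# Returns [end_key, next_start_key] for every gap between neighbouring
# regions.  Only interior holes are reported; a missing first or last
# region is not checked in this sketch.
def find_holes(regions)
  regions.sort_by(&:start_key).each_cons(2)
         .select { |a, b| !a.end_key.empty? && a.end_key < b.start_key }
         .map { |a, b| [a.end_key, b.start_key] }
end

regions = [
  Region.new('', 'b'),
  Region.new('b', 'd'),
  Region.new('f', '') # the region covering 'd'..'f' is missing
]

p find_holes(regions) # => [["d", "f"]]
```

The pair returned for each hole is the start key and end key a
replacement region would need in order to plug it.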

> I modified check_meta.rb to achieve the insertion. I've tried it in our
> environment and it seems to work; at least hbase hbck tells me it is okay.
> I attached it to this message. Any comments are greatly appreciated.


> I have one more question. I create the new region record with both startkey
> and endkey set. It seems possible that, if we're unlucky, a split happens
> during the insertion and we end up with overlapping regions. I wonder how
> hbase handles this sort of problem in general.

Well, you can't do cross-row transactions, which is sort of what you
would need here, so yes, it's possible that there could be overlap.
Though, didn't you say the region was missing? (If so, how could it
split?)
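
For what it's worth, an overlap of this kind can be spotted by sorting
the regions by start key and comparing each region's end key with the
start key of the next one.  The Ruby below is only an illustrative
sketch: KeyRange and find_overlaps are made-up names, the key ranges are
hard-coded instead of read from .META., and end keys are assumed to be
exclusive with an empty end key meaning "to the end of the table".

```ruby
# Hypothetical stand-in for a region's key range.
KeyRange = Struct.new(:start_key, :end_key)

# Returns each pair of neighbouring ranges that overlap.  Checking
# neighbours is enough to tell whether any overlap exists, but it does
# not list every overlapping pair.
def find_overlaps(ranges)
  ranges.sort_by(&:start_key).each_cons(2)
        .select { |a, b| a.end_key.empty? || a.end_key > b.start_key }
end

ranges = [
  KeyRange.new('', 'd'),
  KeyRange.new('d', 'f'), # newly inserted to plug a hole
  KeyRange.new('e', '')   # existing region that already covers 'e'..'f'
]

find_overlaps(ranges).each do |a, b|
  puts "overlap: [#{a.start_key}, #{a.end_key}) and [#{b.start_key}, #{b.end_key})"
end
```

With the sample data this reports the inserted 'd'..'f' range against
the range starting at 'e'.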

> When I was playing with the test environment, I saw a message saying some
> region 'is multiply assigned to region servers'. That is also an
> inconsistent scenario; how can I recover from this problem?

Can you figure out how this double-assign happened?

To 'recover' you'd close it on one of the regionservers.  Send a
close_region 'REGION_NAME', 'SERVER_NAME' in the shell (read the shell's
close_region help to be sure, for my memory is not reliable).

