hbase-user mailing list archives

From Arun Ramakrishnan <aramakrish...@languageweaver.com>
Subject RE: hbase regions reporting multiple times
Date Mon, 12 Jul 2010 17:47:14 GMT
First, regarding the DNS line of thought: I tried to debug after reading http://www.mail-archive.com/hbase-dev@hadoop.apache.org/msg16982.html
But the hosts are not multi-homed, and I don't see log entries indicating the problem described
in that post.
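
For anyone who wants to rule this out on their own nodes, here is a minimal sketch of the
forward/reverse lookup check that Ryan describes in the quoted message below. It is only an
illustration, not something taken from hbase: the function name dns_round_trip and its
injectable forward/reverse parameters are my own, and it assumes that the name passed in is
the same fully qualified name the regionserver would report.

```python
import socket


def dns_round_trip(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve the IP back, and compare the names.

    Returns (ok, ip, reverse_name). The resolver functions are parameters
    (an assumption of this sketch) so the check can be exercised offline.
    """
    try:
        ip = forward(hostname)
    except socket.gaierror:
        # Forward lookup failed: the name does not resolve at all.
        return False, None, None
    try:
        # gethostbyaddr returns (primary_name, alias_list, ip_list).
        reverse_name = reverse(ip)[0]
    except (socket.herror, socket.gaierror):
        # No reverse (PTR) record for this address.
        return False, ip, None
    # Compare case-insensitively and ignore a trailing dot. A short name
    # versus a fully qualified name is deliberately reported as a mismatch.
    same = reverse_name.lower().rstrip(".") == hostname.lower().rstrip(".")
    return same, ip, reverse_name


if __name__ == "__main__":
    import sys

    # Usage: python dns_check.py host1 host2 ...
    for host in sys.argv[1:]:
        ok, ip, back = dns_round_trip(host)
        print(f"{host} -> {ip} -> {back}: {'OK' if ok else 'MISMATCH'}")
```

Running it on each node against every other node's name would show whether any host fails to
round-trip. A clean result does not rule out other causes, it only narrows things down.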

I restarted hbase after removing the new nodes that were giving problems. It has been stable
since. I am going to try adding these nodes back incrementally to see how things go.

It's quite likely that the .META. table blocks moved, since I reconfigured hdfs. I am wondering
what can be done to avoid this block-moving problem or to recover from it. Should I just give
hbase some time and let it figure things out?

J-D, which problem is normal? Just the error message, or the issue of regions being reported
multiple times? Could you give some details on how losing the last few edits could cause this
problem?

Also, losing the last few edits could be taken care of by calling a flush before shutting hbase
down, if it doesn't already do it, right?


-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: Sunday, July 11, 2010 5:36 PM
To: user@hbase.apache.org
Subject: Re: hbase regions reporting multiple times

This problem is "normal" or at least "expected" since Hadoop doesn't
support fsSync, so the last edits that went to a region server are
lost. Fortunately this is finally fixed in the upcoming release.


On Sat, Jul 10, 2010 at 3:15 AM, Jamie Cockrill
<jamie.cockrill@gmail.com> wrote:
> Arun,
> I had a very similar issue with my cluster when the regionserver with
> the .META. table on it crashed. It crippled the cluster for a while,
> but after shutting various things down and restarting them again, it
> seemed to work itself out eventually. I had to do this a few times and
> unfortunately I didn't keep a record of the order in which I shut
> things down and restarted them.
> The problem seemed to stem from the master thinking that META was
> stored on a node and that node having no knowledge of ever having held
> it. I tried a few major_compact of META, hoping that would fix it, but
> each failed with the same exception as below. The weird thing was that
> I could see (through the web UI on master) that META was now being
> held on a different regionserver.
> I wouldn't necessarily follow my lead in randomly shutting things down
> and hoping for the best as it may well have been something entirely
> different that fixed the issue in the end. If all else fails, try
> restarting the master and the regionservers a few times and see if
> that works out the kinks.
> thanks
> Jamie
> On 10 July 2010 04:48, Ryan Rawson <ryanobjc@gmail.com> wrote:
>> Others will have to chime in for details, but typically this means you
>> are having DNS issues.  That is the hostname is resolving to an ip and
>> not resolving back to the same name or vice versa or any other combo
>> of non-roundtripping involving ip and dns names.
>> -ryan
>> On Fri, Jul 9, 2010 at 6:41 PM, Arun Ramakrishnan
>> <aramakrishnan@languageweaver.com> wrote:
>>> I shutdown hbase. Added some new nodes to hdfs, rebalanced. Also added those nodes to hbase regionservers.
>>> Then started hbase.
>>> I am having this strange problem where the new nodes let's say host1 thru host4 gets repeatedly reported/added to the regionservers list.
>>> Initially when I did a "report 'simple'" from the shell, it showed me 10 unique hosts. Then within a matter of minutes it grew to 17 ( with the newly added hosts repeating multiple times).
>>> Also, the web UI failed with the following error.
>>> ##############
>>> HTTP ERROR: 500
>>> Trying to contact region server for region .META.,,1, row '', but failed after 3 attempts.
>>> Exceptions:
>>> org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException:
>>>        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2266)
>>>        at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1845)
>>>        at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
>>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>>> ###############
>>> Any insight into why the regions get repeated multiple times? I did a hadoop fsck / and it reports that all the blocks have been replicated 3 times (the configured value).
>>> Thanks
>>> Arun
