hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: NoRouteToHostException causes Master abort when the RegionServer hosting ROOT is not available
Date Fri, 01 Apr 2011 15:57:11 GMT
The below looks like HBASE-3660, 'HMaster will exit when starting with
stale data in cached locations such as -ROOT- or .META.', which is
included in the 0.90.2 RC.

On Fri, Apr 1, 2011 at 8:48 AM, Brush,Ryan <RBRUSH@cerner.com> wrote:
> This happens in similar conditions but is distinct from HBASE-3617. When the region server
> hosting ROOT isn't available during restart, the NoRouteToHostException propagates all the
> way up the call stack and causes the master to abort. It looks like this can be addressed by
> handling NoRouteToHostException at some point and considering that node/region server offline.
>
> I applied the patch from HBASE-3617 and it didn't fix the problem I'm seeing, which I
> expected given the stack trace below. Assuming this reasoning is correct, does this merit
> a separate JIRA? It does seem critical in that the failure of a single node is preventing
> us from bringing up our cluster.
> 2011-04-01 10:15:19,472 INFO org.apache.hadoop.hbase.master.ServerManager: Exiting wait on regionserver(s) to checkin; count=2, stopped=false, count of regions out on cluster=0
> 2011-04-01 10:15:19,486 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder belongs to an existing region server
> 2011-04-01 10:15:19,486 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder belongs to an existing region server
> 2011-04-01 10:15:22,508 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
> java.net.NoRouteToHostException: No route to host
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>     at $Proxy6.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:385)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:211)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:458)
>     at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:425)
>     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:383)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:278)
> 2011-04-01 10:15:22,510 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2011-04-01 10:15:22,510 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service
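
The handling suggested in the quoted message (catch NoRouteToHostException and
treat the server as offline instead of letting it abort the master) might look
roughly like the Java sketch below. This is only an illustration of the idea,
not the HBASE-3660 patch and not HBase code: RootLocationCheck,
ServerConnector, and isServerReachable are hypothetical names, and the host
names are placeholders. The only real class it relies on is
java.net.NoRouteToHostException.

```java
import java.io.IOException;
import java.net.NoRouteToHostException;

class RootLocationCheck {

    // Hypothetical stand-in for whatever opens the RPC connection to a
    // region server (in the trace above, the getHRegionConnection path).
    interface ServerConnector {
        void connect(String hostAndPort) throws IOException;
    }

    // Returns true if the connection attempt succeeded, false if the host
    // was unreachable and should be considered offline. Any other
    // IOException still propagates to the caller.
    static boolean isServerReachable(ServerConnector connector, String hostAndPort)
            throws IOException {
        try {
            connector.connect(hostAndPort);
            return true;
        } catch (NoRouteToHostException e) {
            // Assumption: an unreachable host means the cached location is
            // stale, so report it as offline rather than aborting.
            System.err.println("Treating " + hostAndPort + " as offline: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulated connectors; a real caller would attempt an actual connection.
        ServerConnector unreachable = hp -> {
            throw new NoRouteToHostException("No route to host");
        };
        ServerConnector healthy = hp -> { };

        System.out.println(isServerReachable(unreachable, "rs1.example.com:60020"));
        System.out.println(isServerReachable(healthy, "rs2.example.com:60020"));
    }
}
```

With a check like this, a caller that verifies the cached ROOT location could
fall through to reassigning the region when the method returns false, rather
than seeing the exception surface as "Unhandled exception. Starting shutdown."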
