hbase-user mailing list archives

From Saad Mufti <saad.mu...@gmail.com>
Subject Re: HBase failed on local exception and failed servers list.
Date Sun, 11 Mar 2018 01:25:09 GMT
Are you using the AuthUtil class to reauthenticate? This class is in HBase, and
it uses the Hadoop class UserGroupInformation to do the actual login and
re-login. However, if your UserGroupInformation class comes from Hadoop 2.5.1
or earlier, it has a bug when running on Java 8, as most of us are. The
relogin code uses a test to decide whether the login is Kerberos/keytab
based. That test passed on Java 7 but fails on Java 8, because it looks for a
specific class in an underlying list of Kerberos objects assigned to your
principal, and that class no longer appears there in the Java 8
implementation. We fixed this by explicitly upgrading our Hadoop dependency
to a newer version, in our case 2.6.1, where this problem has been fixed.
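
To make the failure mode concrete, here is a minimal sketch of the kind of
check described above, using only JDK classes. It is not the Hadoop source.
It assumes the class in question is KerberosKey and that a Java 8 keytab
login exposes a KeyTab object in the Subject's private credentials instead.
The names KeytabLoginCheck, oldStyleCheck and keytabAwareCheck, and the
dummy.keytab path, are hypothetical, and the Subject is built by hand to
simulate a Java 8 style login rather than produced by a real Kerberos login.

```java
import java.io.File;
import javax.security.auth.Subject;
import javax.security.auth.kerberos.KerberosKey;
import javax.security.auth.kerberos.KeyTab;

class KeytabLoginCheck {

    // Assumed shape of the older test: treat the login as keytab based only
    // if KerberosKey objects are present in the private credentials.
    static boolean oldStyleCheck(Subject subject) {
        return !subject.getPrivateCredentials(KerberosKey.class).isEmpty();
    }

    // Assumed shape of a keytab-aware test: look for a KeyTab object instead.
    static boolean keytabAwareCheck(Subject subject) {
        return !subject.getPrivateCredentials(KeyTab.class).isEmpty();
    }

    // Hand-built Subject that carries only a KeyTab, simulating what the
    // sketch assumes a Java 8 keytab login looks like. The file need not
    // exist; KeyTab.getInstance does not read it at creation time.
    static Subject simulatedJava8Subject() {
        Subject subject = new Subject();
        subject.getPrivateCredentials().add(KeyTab.getInstance(new File("dummy.keytab")));
        return subject;
    }

    public static void main(String[] args) {
        Subject subject = simulatedJava8Subject();
        System.out.println("old-style check:    " + oldStyleCheck(subject));
        System.out.println("keytab-aware check: " + keytabAwareCheck(subject));
    }
}
```

If the old-style check returns false for a login that really is keytab
based, relogin code guarded by it would skip the re-login, which would be
consistent with connections failing once the original ticket expires.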

If this is the condition affecting your application, it is an easy enough
fix.

Hope this helps.

Cheers.

----
Saad



On Tue, Feb 27, 2018 at 1:16 PM, apratim sharma <apratim.sharma@gmail.com>
wrote:

> Hi Guys,
>
> I am using HBase 1.2.0 on a Kerberos-secured Cloudera CDH 5.8 cluster.
> I have a persistent application that authenticates using a keytab and creates
> an HBase connection. Our code also takes care of reauthentication and
> recreating a broken connection.
> The code worked fine in previous versions of HBase. However, what we see
> with HBase 1.2 is that after 24 hours the HBase connection stops working,
> giving the following error:
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
> attempts=2, exceptions:
> Tue Feb 13 12:57:51 PST 2018,
> RpcRetryingCaller{globalStartTime=1518555467140, pause=100, retries=2},
> org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to
> pdmcdh01.xyz.com/192.168.145.62:60020 failed on local exception:
> org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection
> to pdmcdh01.xyz.com/192.168.145.62:60020 is closing. Call id=137,
> waitTime=11
> Tue Feb 13 12:58:01 PST 2018,
> RpcRetryingCaller{globalStartTime=1518555467140, pause=100, retries=2},
> org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to
> pdmcdh01.xyz.com/192.168.145.62:60020 failed on local exception:
> org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection
> to pdmcdh01.xyz.com/192.168.145.62:60020 is closing. Call id=139,
> waitTime=13
>
>         at
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(
> RpcRetryingCaller.java:147)
>         at org.apache.hadoop.hbase.client.HTable.get(HTable.java:935)
>         at org.apache.hadoop.hbase.client.HTable.get(HTable.java:901)
> Our code reauthenticates and creates the connection again, but it still keeps
> failing:
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
> attempts=2, exceptions:
> Wed Feb 21 14:30:31 PST 2018,
> RpcRetryingCaller{globalStartTime=1519252219159, pause=100, retries=2},
> java.io.IOException: Couldn't setup connection for pipe@HADOOP.XYZ.COM to
> hbase/pdmcdh01.xyz.com@HADOOP.XYZ.COM
> Wed Feb 21 14:30:31 PST 2018,
> RpcRetryingCaller{globalStartTime=1519252219159, pause=100, retries=2},
> org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the
> failed servers list: pdmcdh01.xyz.com/192.168.145.62:60020
>
>         at
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(
> RpcRetryingCaller.java:147)
>         at org.apache.hadoop.hbase.client.HTable.get(HTable.java:935)
>         at org.apache.hadoop.hbase.client.HTable.get(HTable.java:901)
> I know that the client keeps a server in the failed list for a few seconds in
> order to avoid too many connection attempts. So I waited and tried again after
> some time, but still got the same error.
> Once we restart our application, everything starts working fine again for
> the next 24 hours.
>
> This 24-hour gap indicates that it could be something related to the Kerberos
> ticket expiry time; however, there is no log to indicate a Kerberos
> authentication issue.
> Moreover, we are handling the exception and trying to authenticate and
> create the connection again, but nothing works until we restart the JVM. This
> is very strange.
>
> I would really appreciate any help or pointers on this issue.
>
> Thanks a lot
> Apratim
>
