hbase-user mailing list archives

From Viraj Jasani <vjas...@apache.org>
Subject Re: regionserver can't connect to master
Date Wed, 25 Mar 2020 11:34:47 GMT
Thanks, Manuel.

OK, so the connection to port 16000 looks fine from the RegionServer. At this point, all that
is left is to start the HMaster (HM) and the RegionServer (RS) again and debug further. I hope
the Master UI on port 16010 is accessible as well.
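
In case it helps, the same reachability check can be scripted for both ports. This is only a minimal sketch, assuming Python 3 is available on the RegionServer host. The helper name port_open is made up for illustration, and 16000/16010 are the default Master RPC and UI ports, which may differ if they were overridden in hbase-site.xml. Like telnet or nc, a successful TCP connect only shows that the port is reachable; it does not prove that an RPC call would complete.

```python
import socket


def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, port_open("gl-hdp-ctrl03.local", 16000) and port_open("gl-hdp-ctrl03.local", 16010) could be run from the RS host.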
 
By the way, the API response you provided above lists all services present in the blueprint,
not only the running ones: it includes HBase even though HM and RS are down.
In any case, it is recommended to bring up the cluster with a stable version: https://downloads.apache.org/hbase/stable/
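
To see which services are actually running rather than merely defined, the Ambari REST API can, as far as I know, report each service's state if ?fields=ServiceInfo/state is appended to the same services URL. Below is a rough sketch, assuming the response then carries a "state" key under "ServiceInfo" with values such as STARTED or INSTALLED. The function name and the sample payload are illustrative only and are not taken from your cluster.

```python
import json


def services_by_state(response_text: str) -> dict:
    """Group service names by their ServiceInfo/state value.

    Assumes the payload shape returned by
    GET /api/v1/clusters/<cluster>/services?fields=ServiceInfo/state
    Services without a state field are grouped under "UNKNOWN".
    """
    grouped = {}
    for item in json.loads(response_text).get("items", []):
        info = item.get("ServiceInfo", {})
        state = info.get("state", "UNKNOWN")
        grouped.setdefault(state, []).append(info.get("service_name"))
    return grouped


# Hypothetical payload for illustration only; the states are made up.
sample = json.dumps({
    "items": [
        {"ServiceInfo": {"cluster_name": "Grandline", "service_name": "HDFS", "state": "STARTED"}},
        {"ServiceInfo": {"cluster_name": "Grandline", "service_name": "HBASE", "state": "INSTALLED"}},
    ]
})
print(services_by_state(sample))
```

A service that is installed but stopped would be expected to show up under INSTALLED rather than STARTED.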


On 2020/03/23 05:14:58, Manuel Sopena Ballesteros <manuel.sb@garvan.org.au> wrote: 
> Hi Jasani,
> 
> 
> Which HBase version are you using?
> 
> [luffy@gl-hdp-ctrl03 ~]$ hbase version
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/phoenix/phoenix-5.0.0.3.1.0.0-78-server.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> HBase 2.0.2.3.1.0.0-78
> Source code repository git://ctr-e138-1518143905142-586755-01-000023.hwx.site/grid/0/jenkins/workspace/HDP-parallel-centos7/SOURCES/hbase revision=
> Compiled by jenkins on Thu Dec  6 12:27:45 UTC 2018
> From source with checksum 015c34650c163b249d16fc7e496a030e
> 
> 
> You are bringing up a fresh cluster and not doing an upgrade, right?
> 
> Yes, this is a fresh cluster that I am deploying through Ambari blueprints (I always reset Ambari to factory settings before deploying the blueprint).
> 
> 
> Has Ambari successfully brought up NameNodes and DataNodes?
> 
> I think so
> 
> [cid:0cef77dd-f616-45ef-8214-e0bb0006b665]
> 
> 
> How many components are already running so far?
> 
> {
>   "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services",
>   "items" : [
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/AMBARI_METRICS",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "AMBARI_METRICS"
>       }
>     },
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/HBASE",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "HBASE"
>       }
>     },
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/HDFS",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "HDFS"
>       }
>     },
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/HIVE",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "HIVE"
>       }
>     },
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/MAPREDUCE2",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "MAPREDUCE2"
>       }
>     },
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/SMARTSENSE",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "SMARTSENSE"
>       }
>     },
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/SPARK2",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "SPARK2"
>       }
>     },
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/TEZ",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "TEZ"
>       }
>     },
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/YARN",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "YARN"
>       }
>     },
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/ZEPPELIN",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "ZEPPELIN"
>       }
>     },
>     {
>       "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/ZOOKEEPER",
>       "ServiceInfo" : {
>         "cluster_name" : "Grandline",
>         "service_name" : "ZOOKEEPER"
>       }
>     }
>   ]
> }
> 
> 
> Are they connected (e.g. NN and DN), and is only the RS having trouble connecting to the HM?
> 
> Yes, this is my understanding
> 
> 
> Although telnet seems correct, can you also try "nc -zv gl-hdp-ctrl03.local 16000" from RS just to double check?
> 
> $ nc -zv gl-hdp-ctrl03.local 16000
> Ncat: Version 7.50 ( https://nmap.org/ncat )
> Ncat: Connected to 192.168.20.248:16000.
> Ncat: 0 bytes sent, 0 bytes received in 0.02 seconds.
> 
> 
> thank you
> 
> ________________________________
> From: Viraj Jasani <vjasani@apache.org>
> Sent: Sunday, 22 March 2020 2:47:09 AM
> To: user@hbase.apache.org
> Subject: Re: regionserver can't connect to master
> 
> Which HBase version are you using? You are bringing up a fresh cluster and not doing an upgrade, right? Has Ambari successfully brought up NameNodes and DataNodes? How many components are already running so far? Are they connected (e.g. NN and DN), and is only the RS having trouble connecting to the HM? Although telnet seems correct, can you also try "nc -zv gl-hdp-ctrl03.local 16000" from RS just to double check?
> Thanks
> 
> On 2020/03/20 23:45:28, Manuel Sopena Ballesteros <manuel.sb@garvan.org.au> wrote:
> > Dear HBase community,
> >
> > I am having an issue with my Ambari HBase deployment where the regionserver is not able to connect to the master.
> >
> > Hbase Master log files:
> > 2020-03-21 02:36:53,614 INFO [Thread-16] master.ServerManager: Waiting on regionserver count=0; waited=3174901ms, expecting min=1 server(s), max=NO_LIMIT server(s), timeout=30000ms, lastChange=-3174901ms
> > 2020-03-21 02:36:54,287 WARN [master/gl-hdp-ctrl03:16000] assignment.AssignmentManager: No servers available; cannot place 1 unassigned regions.
> >
> > Hbase region server logs:
> > Caused by: java.net.ConnectException: Call to gl-hdp-ctrl03.local/192.168.20.248:16000 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: connection timed out: gl-hdp-ctrl03.local/192.168.20.248:16000
> > at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:166)
> > at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
> > at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95)
> > at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410)
> > at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406)
> > at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103)
> > at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118)
> >
> >
> > Test connectivity from region server to master
> > $ telnet gl-hdp-ctrl03.local 16000
> > Trying 192.168.20.248...
> > Connected to gl-hdp-ctrl03.local.
> > Escape character is '^]'.
> >
> > Any idea why the regionserver can't connect?
> >
> > Thank you very much
> > NOTICE
> > Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed.
> >
> 
> 
