hbase-user mailing list archives

From Manuel Sopena Ballesteros <manuel...@garvan.org.au>
Subject Re: regionserver can't connect to master
Date Mon, 23 Mar 2020 05:14:58 GMT
Hi Jasani,


Which HBase version are you using?

[luffy@gl-hdp-ctrl03 ~]$ hbase version
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/phoenix/phoenix-5.0.0.3.1.0.0-78-server.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase 2.0.2.3.1.0.0-78
Source code repository git://ctr-e138-1518143905142-586755-01-000023.hwx.site/grid/0/jenkins/workspace/HDP-parallel-centos7/SOURCES/hbase revision=
Compiled by jenkins on Thu Dec  6 12:27:45 UTC 2018
From source with checksum 015c34650c163b249d16fc7e496a030e


You are bringing up a fresh cluster and not doing an upgrade, right?

Yes, this is a fresh cluster that I am deploying through Ambari blueprints (I always reset
Ambari to factory settings before deploying the blueprint).


Has Ambari successfully brought up NameNodes and DataNodes?

I think so

[inline attachment not preserved in the archive]


How many components are already running so far?

{
  "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services",
  "items" : [
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/AMBARI_METRICS",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "AMBARI_METRICS"
      }
    },
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/HBASE",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "HBASE"
      }
    },
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/HDFS",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "HDFS"
      }
    },
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/HIVE",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "HIVE"
      }
    },
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/MAPREDUCE2",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "MAPREDUCE2"
      }
    },
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/SMARTSENSE",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "SMARTSENSE"
      }
    },
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/SPARK2",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "SPARK2"
      }
    },
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/TEZ",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "TEZ"
      }
    },
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/YARN",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "YARN"
      }
    },
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/ZEPPELIN",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "ZEPPELIN"
      }
    },
    {
      "href" : "http://10.0.1.245:8080/api/v1/clusters/Grandline/services/ZOOKEEPER",
      "ServiceInfo" : {
        "cluster_name" : "Grandline",
        "service_name" : "ZOOKEEPER"
      }
    }
  ]
}
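
In case it helps anyone reading a listing like the one above, here is a minimal Python sketch that pulls the service names out of such a response. It assumes the JSON is already available as a string; the function name `service_names` and the shortened `SAMPLE` payload are only illustrative. Note that this listing reports which services are registered in the cluster, not whether they are running.

```python
import json
from typing import List

# Shortened, illustrative payload in the same shape as the listing above.
SAMPLE = """
{
  "items" : [
    { "ServiceInfo" : { "cluster_name" : "Grandline", "service_name" : "HBASE" } },
    { "ServiceInfo" : { "cluster_name" : "Grandline", "service_name" : "ZOOKEEPER" } }
  ]
}
"""


def service_names(payload: str) -> List[str]:
    """Return the service_name of every entry under "items", in order."""
    doc = json.loads(payload)
    # A response without "items" is treated as an empty listing.
    return [item["ServiceInfo"]["service_name"] for item in doc.get("items", [])]
```

Calling `service_names(SAMPLE)` gives the names in the order they appear in the response.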


Are they connected (e.g. NN and DN), and is only the RS having trouble connecting to the HM?

Yes, this is my understanding


Although telnet seems correct, can you also try "nc -zv gl-hdp-ctrl03.local 16000" from RS
just to double check?

$ nc -zv gl-hdp-ctrl03.local 16000
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 192.168.20.248:16000.
Ncat: 0 bytes sent, 0 bytes received in 0.02 seconds.
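
For repeating this kind of check from a script, a rough Python equivalent of the `nc -zv` test is sketched below. The helper name `can_connect` and the 5-second default timeout are assumptions for illustration. Like `nc` and `telnet`, it only confirms that a TCP connection to the port can be opened; it does not exercise the HBase RPC layer itself.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host name and connects; the socket
        # is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run on the region server host, `can_connect("gl-hdp-ctrl03.local", 16000)` would correspond to the check above.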


thank you

________________________________
From: Viraj Jasani <vjasani@apache.org>
Sent: Sunday, 22 March 2020 2:47:09 AM
To: user@hbase.apache.org
Subject: Re: regionserver can't connect to master

Which HBase version are you using? You are bringing up a fresh cluster and not doing an upgrade,
right? Has Ambari successfully brought up NameNodes and DataNodes? How many components are
already running so far? Are they connected (e.g. NN and DN), and is only the RS having trouble
connecting to the HM? Although telnet seems correct, can you also try "nc -zv gl-hdp-ctrl03.local 16000"
from RS just to double check?
Thanks

On 2020/03/20 23:45:28, Manuel Sopena Ballesteros <manuel.sb@garvan.org.au> wrote:
> Dear HBase community,
>
> I am having an issue with my Ambari HBase deployment where the regionserver is not able to connect to the master.
>
> HBase Master log files:
> 2020-03-21 02:36:53,614 INFO [Thread-16] master.ServerManager: Waiting on regionserver count=0; waited=3174901ms, expecting min=1 server(s), max=NO_LIMIT server(s), timeout=30000ms, lastChange=-3174901ms
> 2020-03-21 02:36:54,287 WARN [master/gl-hdp-ctrl03:16000] assignment.AssignmentManager: No servers available; cannot place 1 unassigned regions.
>
> HBase region server logs:
> Caused by: java.net.ConnectException: Call to gl-hdp-ctrl03.local/192.168.20.248:16000 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: connection timed out: gl-hdp-ctrl03.local/192.168.20.248:16000
> at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:166)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406)
> at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103)
> at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118)
>
>
> Test connectivity from region server to master
> $ telnet gl-hdp-ctrl03.local 16000
> Trying 192.168.20.248...
> Connected to gl-hdp-ctrl03.local.
> Escape character is '^]'.
>
> Any idea why the region server can't connect?
>
> Thank you very much
> NOTICE
> Please consider the environment before printing this email. This message and any attachments
are intended for the addressee named and may contain legally privileged/confidential/copyright
information. If you are not the intended recipient, you should not read, use, disclose, copy
or distribute this communication. If you have received this message in error please notify
us at once by return email and then delete both messages. We accept no liability for the distribution
of viruses or similar in electronic communications. This notice should not be removed.
>

