ambari-dev mailing list archives

From Qin Liu <qinliu5...@gmail.com>
Subject Re: Most widgets on HDFS,YARN,HBase,Storm,and Kafka summary pages show NA
Date Thu, 04 May 2017 05:15:48 GMT
Hi Aravindan,

The collector is up and running, and the metrics on the ambari-metrics
service summary page look good.

The following errors appear in ambari-server.log:
03 May 2017 20:46:34,451 ERROR [ambari-client-thread-210] MetricsRequestHelper:116 - Error getting timeline metrics : Read timed out
03 May 2017 20:46:34,452 DEBUG [ambari-client-thread-210] MetricsRequestHelper:118 - Error getting timeline metrics : Read timed out
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:170)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
        at org.apache.ambari.server.controller.internal.URLStreamProvider.processURL(URLStreamProvider.java:218)
        at org.apache.ambari.server.controller.internal.URLStreamProvider.processURL(URLStreamProvider.java:142)
        at org.apache.ambari.server.controller.metrics.timeline.MetricsRequestHelper.fetchTimelineMetrics(MetricsRequestHelper.java:79)
...
03 May 2017 20:46:34,452 ERROR [ambari-client-thread-210] MetricsRequestHelper:123 - Cannot connect to collector: SocketTimeoutException for qin1.example.com
03 May 2017 20:46:34,453 DEBUG [ambari-client-thread-210] TimelineMetricCacheEntryFactory:88 - Caught IOException on fetching metrics. Read timed out
03 May 2017 20:46:34,453 DEBUG [ambari-client-thread-210] MetricsPropertyProvider:537 - Skip populating resources on socket timeout.
03 May 2017 20:46:34,453 DEBUG [pool-3-thread-1] MetricsCollectorHAManager:87 - MetricsCollectorHostDownEvent caught, Down collector : qin1.example.com

In the ambari-metrics collector log, there are many Call exceptions, e.g.:
2017-05-01 22:42:37,698 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=13, retries=35, started=128676 ms ago, cancelled=false, msg=row 'cpu_idle^@nodemanager' on table 'METRIC_AGGREGATE' at region=METRIC_AGGREGATE,,1493703270057.c9fc6ac679c986905075ba50cf634531., hostname=qin1.example.com,52106,1493703207689, seqNum=2
2017-05-01 22:42:40,310 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 5054  actions to finish
2017-05-01 22:42:41,977 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=11, retries=35, started=88292 ms ago, cancelled=false, msg=row 'kafka.controller.ControllerStats.LeaderElectionRateAndTimeMs.1MinuteRate^@kafka_broker' on table 'METRIC_AGGREGATE' at region=METRIC_AGGREGATE,,1493703270057.c9fc6ac679c986905075ba50cf634531., hostname=qin1.example.com,52106,1493703207689, seqNum=2
2017-05-01 22:42:44,716 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=13, retries=35, started=128792 ms ago, cancelled=false, msg=row 'cpu_idle^@hbase' on table 'METRIC_AGGREGATE' at region=METRIC_AGGREGATE,,1493703270057.c9fc6ac679c986905075ba50cf634531., hostname=qin1.example.com,52106,1493703207689, seqNum=2
...
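
For anyone triaging a collector log like the excerpt above, here is a minimal sketch (not part of Ambari) that tallies the RpcRetryingCaller "Call exception" entries by table and row, so it is easier to see which metrics are stuck retrying. The function name summarize_call_exceptions and the regular expression are illustrative assumptions based only on the log format shown in this message. The pattern tolerates entries that are wrapped across lines.

```python
import re
import sys
from collections import Counter

# Assumed format, taken from the log excerpt in this message:
#   "... Call exception, tries=13, retries=35, ..., msg=row '<row>' on table '<table>' at ..."
# \s* and \s+ allow for entries that were wrapped across several lines.
CALL_EXCEPTION = re.compile(
    r"Call exception, tries=(?P<tries>\d+),\s*retries=(?P<retries>\d+),.*?"
    r"msg=row\s+'(?P<row>[^']+)'\s+on table\s+'(?P<table>[^']+)'",
    re.DOTALL,
)


def summarize_call_exceptions(log_text):
    """Return (counts, max_tries), both keyed by (table, row).

    counts    -- how many Call exception entries were seen for each key
    max_tries -- the highest 'tries' value seen for each key
    """
    counts = Counter()
    max_tries = {}
    for match in CALL_EXCEPTION.finditer(log_text):
        key = (match.group("table"), match.group("row"))
        counts[key] += 1
        max_tries[key] = max(max_tries.get(key, 0), int(match.group("tries")))
    return counts, max_tries


if __name__ == "__main__":
    # Usage (hypothetical file name): python summarize.py < collector.log
    counts, max_tries = summarize_call_exceptions(sys.stdin.read())
    for (table, row), n in counts.most_common():
        print(f"{table}\t{row}\tentries={n}\tmax_tries={max_tries[(table, row)]}")
```

Rows that keep appearing with a rising tries value are the ones whose writes to that table are not completing.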

I should also mention that the following metrics are not available:
YARN/components/NODEMANAGER?fields=metrics/cpu/cpu_idle,
HBASE/components/HBASE_REGIONSERVER?fields=metrics/cpu/cpu_nice, and
metrics/kafka/controller/ControllerStats/LeaderElectionRateAndTimeMs/1MinuteRate.

Thanks,

Qin


On Wed, May 3, 2017 at 12:16 PM, Aravindan Vijayan <avijayan@hortonworks.com> wrote:

> Hi Qin,
>
> Anything from the ambari-server logs that might point to the issue? Is the
> metrics collector component up and running?
>
>
> Also, AMBARI-20622 looks like the last commit that went into the
> Ambari-2.5.0.3 release. This could be off by a commit or two since there is
> no git tag for that release.
> --
> Thanks and Regards,
> Aravindan Vijayan
>
>
> On 5/2/17, 10:20 AM, "Qin Liu" <qinliu5678@gmail.com> wrote:
>
> >Hi @avijayan <https://reviews.apache.org/users/avijayan/>, @swagle
> ><https://reviews.apache.org/users/swagle/>, @dsen
> ><https://reviews.apache.org/users/dsen/>, and all,
> >
> >Can someone also tell me what is the last commit for building
> >http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.5.0.3/ambari.repo?
> >
> >Thanks
> >
> >On Mon, May 1, 2017 at 11:19 PM, Qin Liu <qinliu5678@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> Does anyone have this metrics issue "Most widgets on
> >> HDFS,YARN,HBase,Storm,and Kafka Summary pages show NA" with latest trunk
> >> and latest branch-2.5?
> >>
> >> I am having this issue with ambari RPMs I built with latest trunk and
> >> latest branch-2.5 using the following command:
> >>
> >> mvn -B clean install package rpm:rpm -Dbuild-rpm -DskipTests
> >> -Dpython.ver="python >= 2.6"
> >>
> >> But I don't have this issue if I use
> >> http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.5.0.3/ambari.repo.
> >>
> >> Can anyone shed some light on this?
> >>
> >> Thanks
> >> Qin
> >>
>
