[ https://issues.apache.org/jira/browse/AMBARI-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vitaly Brodetskyi updated AMBARI-1534:
--------------------------------------
Description:
Each host in the cluster runs ambari-agent.
There should be a Nagios alert that watches the ambari-agent process. Since the system
does not allow direct communication with an ambari-agent, this check should either a) check
the process running on the host or b) ping the Ambari Server REST API to confirm that the
agent is still heartbeating.
This alert should be shown on each Hosts > {host} page in Ambari Web.
Service Description: Ambari Agent (ambari-agent) process down
Service Group: AMBARI
Check / Retry Interval: 0.25
Note: need to add new service group AMBARI for Nagios
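A minimal sketch of what such a check could look like as a Nagios plugin, covering both options above. The exit codes are the standard Nagios plugin codes; everything else is an assumption made for illustration and is not taken from the attached patch: the pid file path, the function names, and the 180-second threshold are hypothetical. Fetching the last heartbeat time from the Ambari Server REST API is left out because this issue does not name the endpoint, so option b) is shown only as the classification step.

```python
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Hypothetical default; adjust to wherever the agent writes its pid file.
PID_FILE = "/var/run/ambari-agent/ambari-agent.pid"


def pid_is_running(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def check_agent_process(pid_file=PID_FILE):
    """Option a): return (exit_code, message) for the local agent process."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return CRITICAL, "CRITICAL - cannot read pid from %s" % pid_file
    if pid_is_running(pid):
        return OK, "OK - ambari-agent running (pid %d)" % pid
    return CRITICAL, "CRITICAL - ambari-agent (pid %d) not running" % pid


def check_heartbeat_age(age_seconds, crit_after=180):
    """Option b): classify the time since the server last heard from the agent.

    age_seconds is assumed to come from the Ambari Server REST API;
    crit_after is a hypothetical threshold in seconds.
    """
    if age_seconds is None or age_seconds < 0:
        return UNKNOWN, "UNKNOWN - no usable heartbeat timestamp"
    if age_seconds > crit_after:
        return CRITICAL, "CRITICAL - no heartbeat for %d s" % age_seconds
    return OK, "OK - last heartbeat %d s ago" % age_seconds
```

A wrapper script would print the returned message and exit with the returned code, which is how Nagios reads a plugin result. Option a) has to run on the monitored host itself, while option b) can run from the Nagios server.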
was:
There should be a Nagios alert that watches the ambari-server process. This check should
either a) check the process running on the host or b) ping the Ambari Server REST API for
status.
Service Description: Ambari Server (ambari-server) process down
Service Group: AMBARI
Check / Retry Interval: 0.25
Note: need to add new service group AMBARI for Nagios
Question: Should this alert be shown in Ambari Web, given that Ambari Web is down whenever
this alert goes critical? I think so, and this host check should be displayed with the host
that runs Ambari Server (Hosts > host) so people can see that the check is occurring properly.
> Add Nagios check for ambari-agent process for each host in the cluster
> ----------------------------------------------------------------------
>
> Key: AMBARI-1534
> URL: https://issues.apache.org/jira/browse/AMBARI-1534
> Project: Ambari
> Issue Type: Improvement
> Reporter: Jeff Sposetti
> Assignee: Vitaly Brodetskyi
> Fix For: 1.4.1
>
> Attachments: AMBARI-1534.patch
>
>
> Each host in the cluster runs ambari-agent.
> There should be a Nagios alert that watches the ambari-agent process. Since the system
> does not allow direct communication with an ambari-agent, this check should either a) check
> the process running on the host or b) ping the Ambari Server REST API to confirm that the
> agent is still heartbeating.
> This alert should be shown on each Hosts > {host} page in Ambari Web.
> Service Description: Ambari Agent (ambari-agent) process down
> Service Group: AMBARI
> Check / Retry Interval: 0.25
> Note: need to add new service group AMBARI for Nagios