uima-dev mailing list archives

From "Jerry Cwiklik (JIRA)" <...@uima.apache.org>
Subject [jira] [Created] (UIMA-5528) UIMA-DUCC: improve agent monitoring of cgroups
Date Mon, 07 Aug 2017 15:35:00 GMT
Jerry Cwiklik created UIMA-5528:

             Summary: UIMA-DUCC: improve agent monitoring of cgroups 
                 Key: UIMA-5528
                 URL: https://issues.apache.org/jira/browse/UIMA-5528
             Project: UIMA
          Issue Type: Improvement
          Components: DUCC
            Reporter: Jerry Cwiklik
            Assignee: Jerry Cwiklik
             Fix For: future-DUCC

Currently the agent validates a node's cgroups only at startup. On older versions of Red Hat,
the cgroup memory subsystem has been observed to disappear because of an OS bug. After that,
all jobs fail because cgroup creation fails.

Modify the agent's node monitoring to test cgroup creation at regular intervals.
This check should be part of the node metrics collection. If cgroup creation fails, the
agent should mark the state of cgroups as 'Broken'. This new state will be displayed by duccmon.
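
A minimal sketch of the proposed check, not the agent's actual code. It assumes that
cgroup creation can be approximated by creating and removing a directory under the
memory controller's mount point (cgroups v1 behaviour); the real agent may use a
different mechanism. The names probe_cgroup_creation, collect_node_metrics, CgroupState
and the "ducc-probe-" prefix are hypothetical and chosen only for illustration.

```python
import os
import uuid
from enum import Enum


class CgroupState(Enum):
    """Hypothetical states reported with the node metrics."""
    OK = "OK"
    BROKEN = "Broken"


def probe_cgroup_creation(base_dir):
    """Try to create and remove a throwaway cgroup directory under base_dir.

    Assumption: base_dir is the mount point of the cgroup memory controller.
    Any OS-level failure (missing mount, permission error) counts as Broken.
    """
    probe = os.path.join(base_dir, "ducc-probe-" + uuid.uuid4().hex)
    try:
        os.mkdir(probe)   # fails if the subsystem has disappeared
        os.rmdir(probe)   # clean up so probes do not accumulate
    except OSError:
        return CgroupState.BROKEN
    return CgroupState.OK


def collect_node_metrics(base_dir):
    """Hypothetical metrics hook: include the cgroup state in each collection."""
    return {"cgroups": probe_cgroup_creation(base_dir).value}
```

Calling collect_node_metrics on every metrics collection cycle would give the regular
interval described above without a separate timer.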

This message was sent by Atlassian JIRA
