uima-dev mailing list archives

From "Jerry Cwiklik (JIRA)" <...@uima.apache.org>
Subject [jira] [Created] (UIMA-5794) DUCC: Agent fails to stop processes
Date Wed, 13 Jun 2018 14:18:00 GMT
Jerry Cwiklik created UIMA-5794:
-----------------------------------

             Summary: DUCC: Agent fails to stop processes
                 Key: UIMA-5794
                 URL: https://issues.apache.org/jira/browse/UIMA-5794
             Project: UIMA
          Issue Type: Bug
          Components: DUCC
            Reporter: Jerry Cwiklik
            Assignee: Jerry Cwiklik
             Fix For: 2.2.3-Ducc


The agent sometimes fails to stop running processes. In one specific case, the agent left a few
processes running even though their state was set to Stopping.

[Process Type=Pop DUCC ID=348 PID=17099 State=Stopping Resident Memory=361656320 GC Total=-1 GC Time=-1 Init Stats List Size:0 Reason: JPHasNoActiveJob] Exit Code=0
[Process Type=Pop DUCC ID=364 PID=593 State=Stopping Resident Memory=7382974464 GC Total=-1 GC Time=-1 Init Stats List Size:0 Reason: JPHasNoActiveJob] Exit Code=0

For some reason the agent failed to send SIGKILL after SIGTERM failed to stop them. Because these
processes used a lot of memory, the OS out-of-memory killer ended up killing legitimate processes
to keep the node from running out of memory.

Because the agent logs wrapped, the evidence of what happened has been lost.

Modify the agent to keep sending SIGKILL to processes in the Stopping state after some time
elapses. Perhaps the rogue process detector can be tasked with that.
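
The proposed behavior could look roughly like the sketch below. It is written in Python
for brevity and is not the DUCC agent code. The function name stop_with_escalation, the
helper pid_exists, and the parameters grace_seconds, retry_seconds and max_kills are all
hypothetical, chosen only to illustrate the idea: send SIGTERM once, wait, then keep
sending SIGKILL until the process is gone or a retry limit is reached.

```python
import os
import signal
import time


def pid_exists(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence, delivers nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def stop_with_escalation(pid, grace_seconds=5.0, retry_seconds=1.0,
                         max_kills=10, send=os.kill, alive=pid_exists,
                         sleep=time.sleep):
    """Send SIGTERM, then repeat SIGKILL until the process is gone.

    All names and defaults here are illustrative assumptions.
    Returns True if the process exited, False if it outlived max_kills.
    """
    if not alive(pid):
        return True
    try:
        send(pid, signal.SIGTERM)
    except ProcessLookupError:
        return True  # exited between the check and the signal
    sleep(grace_seconds)  # give the process time to shut down cleanly

    for _ in range(max_kills):
        if not alive(pid):
            return True
        try:
            send(pid, signal.SIGKILL)
        except ProcessLookupError:
            return True
        sleep(retry_seconds)
    return not alive(pid)
```

The send, alive and sleep arguments are injectable so the retry logic can be exercised
without real processes. Note that an exited child that has not yet been reaped by its
parent still answers signal 0, so a real agent would also need to reap its children.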

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)