uima-dev mailing list archives

From "Jerry Cwiklik (JIRA)" <...@uima.apache.org>
Subject [jira] [Reopened] (UIMA-5048) DUCC Orchestrator (OR) record Process Manager (PM) Job CommandLine requests
Date Wed, 24 Aug 2016 16:37:20 GMT

     [ https://issues.apache.org/jira/browse/UIMA-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerry Cwiklik reopened UIMA-5048:
---------------------------------

Need to revisit this issue. When the PM is unable to connect to the OR, it should throw away
the OR state. 

Check whether the HTTP wrapper in the PM handles connectivity problems. It currently
appears to swallow the exception and return null for job details. Change the code to
throw an exception instead, which the PM component should catch and handle as described
above.

In the scenario where the OR itself returns null for job details, the PM should keep its
current behavior: send an update to the agent, where the missing cmdline is detected and
the process is marked as FAILED with reason=MissingCommandLine.
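
The two cases above could be separated roughly as in the following sketch. It is
only an illustration, not the actual DUCC code: OrClient, ProcessManager, Outcome and
the orState map are hypothetical names, and "throw away the OR state" is assumed here
to mean clearing whatever the PM has cached from the OR.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

final class PmOrSketch {

    /** Hypothetical stand-in for the PM's HTTP wrapper around the OR. */
    public interface OrClient {
        // Returns the job command line, or null if the OR has none.
        // Connectivity problems are reported as IOException, not as null.
        String fetchCommandLine(long jobId) throws IOException;
    }

    /** Hypothetical outcomes, named after the cases in this comment. */
    public enum Outcome { LAUNCH, FAILED_MISSING_COMMAND_LINE, OR_STATE_DISCARDED }

    public static final class ProcessManager {
        private final OrClient or;
        // Assumption: the "OR state" is modeled as a cache of command lines.
        private final Map<Long, String> orState = new HashMap<>();

        public ProcessManager(OrClient or) {
            this.or = or;
        }

        public Outcome handleJob(long jobId) {
            try {
                String cmdline = or.fetchCommandLine(jobId);
                if (cmdline == null) {
                    // OR answered but had no command line: keep existing behavior,
                    // i.e. the process ends up FAILED, reason=MissingCommandLine.
                    return Outcome.FAILED_MISSING_COMMAND_LINE;
                }
                orState.put(jobId, cmdline);
                return Outcome.LAUNCH;
            } catch (IOException e) {
                // OR unreachable: discard the cached OR state instead of
                // treating the failure as a missing command line.
                orState.clear();
                return Outcome.OR_STATE_DISCARDED;
            }
        }

        public int cachedStateSize() {
            return orState.size();
        }
    }
}
```

The point of the sketch is that a null reply and a failed connection take different
paths, so a connectivity problem is no longer reported as MissingCommandLine.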

> DUCC Orchestrator (OR) record Process Manager (PM) Job CommandLine requests
> ---------------------------------------------------------------------------
>
>                 Key: UIMA-5048
>                 URL: https://issues.apache.org/jira/browse/UIMA-5048
>             Project: UIMA
>          Issue Type: Bug
>          Components: DUCC
>            Reporter: Lou DeGenaro
>            Assignee: Jerry Cwiklik
>             Fix For: 2.2.0-Ducc
>
>
> On uima-ducc-demo we saw one Job that caused PM to OOM.  According to the PM log, the
> request for Job 784 from PM to Orchestrator to fetch the CommandLine (comprising the CLASSPATH)
> resulted in the unexpected value of null.
> 1. Put more logging code into OR to better understand why a null value was returned to PM
> 2. PM should prevent such Jobs Processes from launching, since there is no command line
> 3. Increase PM's -Xmx on uima-ducc-demo from 150M to 200M (same as SM and WS)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
