airavata-issues mailing list archives

From "Eroma (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (AIRAVATA-1354) Job monitor for Stampede unknown status
Date Thu, 18 Jun 2015 13:57:04 GMT

     [ https://issues.apache.org/jira/browse/AIRAVATA-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eroma closed AIRAVATA-1354.
---------------------------
       Resolution: Cannot Reproduce
    Fix Version/s: 0.15 

Tested experiments in http://dev.test-drive.airavata.org/portal/ultrascan-testing/public/
and could not reproduce the issue.

> Job monitor for Stampede unknown status
> ---------------------------------------
>
>                 Key: AIRAVATA-1354
>                 URL: https://issues.apache.org/jira/browse/AIRAVATA-1354
>             Project: Airavata
>          Issue Type: Improvement
>          Components: GFac
>            Reporter: Raminderjeet Singh
>            Assignee: Shameera Rathnayaka
>              Labels: Monitoring
>             Fix For: 0.15 
>
>
> We should use the experiment ID to name each job, giving it a unique identifier, and
> then use that job name to identify whether the job has entered the UNKNOWN status. If
> the job is still in the UNKNOWN state, we should check the working directory for
> stdout/stderr and take corrective action to resolve the UNKNOWN status. The same logic
> will be useful for job recovery if the Airavata server restarts.
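
The proposal above could be sketched roughly as follows. This is only an illustration, not Airavata or GFac code: the names `JobState`, `job_name_for`, and `resolve_unknown` are hypothetical, as are the `<job name>.stdout` / `<job name>.stderr` file naming and the rule that a non-empty stderr means failure while an existing stdout means completion. A real monitor would need scheduler-specific checks.

```python
import re
from enum import Enum
from pathlib import Path


class JobState(Enum):
    # Hypothetical states for illustration; not Airavata's actual enum.
    ACTIVE = "ACTIVE"
    COMPLETE = "COMPLETE"
    FAILED = "FAILED"
    UNKNOWN = "UNKNOWN"


def job_name_for(experiment_id: str) -> str:
    """Derive a scheduler job name from the experiment ID.

    Assumption: the scheduler only accepts letters, digits and underscores,
    so every other character is replaced with an underscore.
    """
    return re.sub(r"[^A-Za-z0-9_]", "_", experiment_id)


def resolve_unknown(state: JobState, working_dir: Path, job_name: str) -> JobState:
    """Try to correct an UNKNOWN state by inspecting the working directory.

    Assumed heuristic (not from the issue): a non-empty stderr file means
    the job failed, an existing stdout file means it completed, and with
    no such evidence the state stays UNKNOWN.
    """
    if state is not JobState.UNKNOWN:
        return state  # nothing to correct

    stdout = working_dir / f"{job_name}.stdout"
    stderr = working_dir / f"{job_name}.stderr"

    if stderr.exists() and stderr.stat().st_size > 0:
        return JobState.FAILED
    if stdout.exists():
        return JobState.COMPLETE
    return JobState.UNKNOWN
```

Because the job name is derived from the experiment ID alone, the same lookup could in principle be repeated after a server restart without any in-memory state, which is the recovery use the description mentions.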
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
