airavata-issues mailing list archives

From "Dimuthu Upeksha (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (AIRAVATA-2956) Possible race condition in job monitoring
Date Sun, 25 Nov 2018 15:26:00 GMT

     [ https://issues.apache.org/jira/browse/AIRAVATA-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dimuthu Upeksha resolved AIRAVATA-2956.
---------------------------------------
    Resolution: Fixed

Added validation logic to AbstactParser that runs before a job status is put into the
job status queue.

If the validation fails, Email Monitor keeps the email unread until a given period of time
has elapsed.
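
The idea behind the fix can be illustrated with a minimal Java sketch. This is not the
committed Airavata code: the class names, fields, the in-memory set standing in for the
monitoring register, and the choice to mark an email read once the period has elapsed are
all assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for a job status email; not an Airavata class.
final class StatusEmail {
    final String jobId;
    final String status;
    final Instant receivedAt;
    boolean read = false;

    StatusEmail(String jobId, String status, Instant receivedAt) {
        this.jobId = jobId;
        this.status = status;
        this.receivedAt = receivedAt;
    }
}

// Hypothetical sketch of "validate before enqueue, otherwise leave unread".
final class ValidatingMonitorSketch {
    // Stand-in for the monitoring metadata saved by the job submission task.
    final Set<String> registeredJobIds = ConcurrentHashMap.newKeySet();
    // Stand-in for the job status queue consumed by the post workflow.
    final Queue<String> jobStatusQueue = new ConcurrentLinkedQueue<>();
    // Assumed upper bound on how long an email may stay unread.
    final Duration maxUnreadPeriod;

    ValidatingMonitorSketch(Duration maxUnreadPeriod) {
        this.maxUnreadPeriod = maxUnreadPeriod;
    }

    // Returns true if the email was consumed (marked read),
    // false if it was left unread so a later poll can retry it.
    boolean process(StatusEmail email, Instant now) {
        if (registeredJobIds.contains(email.jobId)) {
            // Validation passed: the job is known, so publish its status.
            jobStatusQueue.add(email.jobId + ":" + email.status);
            email.read = true;
            return true;
        }
        Duration age = Duration.between(email.receivedAt, now);
        if (age.compareTo(maxUnreadPeriod) > 0) {
            // Assumption: once the period has elapsed, stop retrying.
            email.read = true;
            return true;
        }
        // Validation failed and the period has not elapsed: keep unread.
        return false;
    }
}
```

With this shape, an email that arrives before the job metadata is saved is simply picked up
again on a later poll, instead of being handed to the post workflow too early.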

> Possible race condition in job monitoring
> -----------------------------------------
>
>                 Key: AIRAVATA-2956
>                 URL: https://issues.apache.org/jira/browse/AIRAVATA-2956
>             Project: Airavata
>          Issue Type: Bug
>          Components: helix implementation
>            Reporter: Dimuthu Upeksha
>            Assignee: Dimuthu Upeksha
>            Priority: Major
>
> When the job submission task submits a job to a compute resource, it returns a job id,
> which is then saved in a ZooKeeper path for post workflow execution. But in some cases,
> the job completes before that metadata is saved in ZooKeeper, and the post workflow then fails.
> 2018-11-21 18:15:55,783 [main] INFO  o.a.a.h.i.w.PostWorkflowManager  - Processing job result of job id 9839 sent by EmailBasedProducer
> 2018-11-21 18:15:55,785 [main] WARN  o.a.a.h.i.w.PostWorkflowManager  - Could not find a monitoring register for job id 9839
> 2018-11-21 18:15:55,785 [main] INFO  o.a.a.h.i.w.PostWorkflowManager  - Status of processing 9839 : false



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)