airavata-issues mailing list archives

From "Eroma (JIRA)" <>
Subject [jira] [Created] (AIRAVATA-2736) Job submitted and running in HPC while the experiment is tagged as FAILED
Date Tue, 03 Apr 2018 19:44:00 GMT
Eroma created AIRAVATA-2736:

             Summary: Job submitted and running in HPC while the experiment is tagged as FAILED
                 Key: AIRAVATA-2736
             Project: Airavata
          Issue Type: Bug
          Components: helix implementation
    Affects Versions: 0.18
         Environment: Helix test environment
            Reporter: Eroma
            Assignee: Dimuthu Upeksha
             Fix For: 0.18

# Submitted an experiment, which in turn submitted a job.
# A job ID was returned and the job status is ACTIVE.
# Because of a ZooKeeper connection issue, the experiment was marked FAILED.
# The job is still running on the HPC resource.
# Airavata is not waiting on job monitoring, because the task status was not updated in ZooKeeper.
# The error from the log is given in [1].
# Experiment ID: SLM001-AmberSander-BR2_5ed5a19f-ab44-4eba-afb7-1feafaf0bbdd

[1]
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /monitoring/2159926/lock
    at org.apache.zookeeper.KeeperException.create(
    at org.apache.zookeeper.KeeperException.create(
    at org.apache.zookeeper.ZooKeeper.create(
    at org.apache.curator.framework.imps.CreateBuilderImpl$
    at org.apache.curator.framework.imps.CreateBuilderImpl$
    at org.apache.curator.RetryLoop.callWithRetry(
    at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(
    at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(
    at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(
    at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(
    at org.apache.airavata.helix.impl.task.submission.JobSubmissionTask.createMonitoringNode(
    at org.apache.airavata.helix.impl.task.submission.DefaultJobSubmissionTask.onRun(
    at org.apache.airavata.helix.impl.task.AiravataTask.onRun(
    at
    at
    at java.util.concurrent.Executors$
    at
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(
    at java.util.concurrent.ScheduledThreadPoolExecutor$
    at java.util.concurrent.ThreadPoolExecutor.runWorker(
    at java.util.concurrent.ThreadPoolExecutor$
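
For readers who want to reason about the sequence reported above, the following is a minimal, self-contained Java sketch of it. It is not Airavata code: every class, enum, and method name here (ReportedSequenceSketch, MonitoringRegistry, runSubmission, and so on) is hypothetical, and the value 2159926 is taken from the path in the trace and only assumed to be the job ID. The sketch models a job that is already ACTIVE when registering it for monitoring fails, and flags the resulting state (experiment FAILED, job still ACTIVE, no monitoring) as inconsistent.

```java
class ReportedSequenceSketch {

    enum JobState { SUBMITTED, ACTIVE }

    enum ExperimentState { EXECUTING, FAILED }

    /** Hypothetical stand-in for the ConnectionLoss error seen in the trace. */
    static class MonitoringRegistrationException extends Exception {
        MonitoringRegistrationException(String message) {
            super(message);
        }
    }

    /** Hypothetical abstraction over "create the monitoring node for this job". */
    interface MonitoringRegistry {
        void register(String jobId) throws MonitoringRegistrationException;
    }

    /** Snapshot of the states after the submission task has finished. */
    static final class Outcome {
        final ExperimentState experiment;
        final JobState job;
        final boolean monitored;

        Outcome(ExperimentState experiment, JobState job, boolean monitored) {
            this.experiment = experiment;
            this.job = job;
            this.monitored = monitored;
        }

        /** True for the state described in this report. */
        boolean isInconsistent() {
            return experiment == ExperimentState.FAILED && job == JobState.ACTIVE;
        }
    }

    /** Models steps 1 to 5 of the report for a job that the cluster has already accepted. */
    static Outcome runSubmission(String jobId, MonitoringRegistry registry) {
        // Steps 1 and 2: the job was submitted, an ID was returned, and the job is ACTIVE.
        JobState job = JobState.ACTIVE;
        try {
            registry.register(jobId);
            return new Outcome(ExperimentState.EXECUTING, job, true);
        } catch (MonitoringRegistrationException e) {
            // Steps 3 to 5: the experiment is marked FAILED, but nothing touches
            // the job, so it keeps running without being monitored.
            return new Outcome(ExperimentState.FAILED, job, false);
        }
    }

    public static void main(String[] args) {
        Outcome failed = runSubmission("2159926", jobId -> {
            throw new MonitoringRegistrationException(
                    "ConnectionLoss for /monitoring/" + jobId + "/lock");
        });
        System.out.println("experiment=" + failed.experiment
                + " job=" + failed.job
                + " monitored=" + failed.monitored
                + " inconsistent=" + failed.isInconsistent());
    }
}
```

In this model, a registry that throws yields an outcome for which isInconsistent() returns true, while a registry that succeeds leaves the experiment EXECUTING with monitoring in place. How the real task should react (for example by cancelling the job or by retrying the registration later) is left open here.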
