airavata-issues mailing list archives

From "Eroma (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRAVATA-2639) Experiment in EXECUTING state without progressing due to a connection loss exception in GfacServerHandler
Date Wed, 10 Jan 2018 18:42:00 GMT
Eroma created AIRAVATA-2639:
-------------------------------

             Summary: Experiment in EXECUTING state without progressing due to a connection loss exception in GfacServerHandler
                 Key: AIRAVATA-2639
                 URL: https://issues.apache.org/jira/browse/AIRAVATA-2639
             Project: Airavata
          Issue Type: Bug
          Components: GFac
    Affects Versions: 0.18
         Environment: https://dev.seagrid.org
            Reporter: Eroma
            Assignee: Shameera Rathnayaka


In a sequential submission of 50 test jobs, one experiment remains in the EXECUTING state without progressing.
Messages in the log indicate a ZooKeeper connection loss while the process node was being created.
Log messages for this experiment from the GFac log:

2018-01-10 16:36:18,610 [pool-3-thread-5] INFO  o.a.a.m.c.impl.ProcessConsumer  -  Message Received with message id 'LAUNCH.PROCESS-fcc07285-e325-4c08-b31c-7347f4103efb and with message type:LAUNCHPROCESS, for processId:PROCESS_3eb71ab2-9c86-4e54-a529-9bbc51eecd5e, expId:SLM003-QEspresso-JS20_05520f07-54c6-4695-b164-a6e7a987777f
2018-01-10 16:36:45,557 [pool-3-thread-5] ERROR o.a.a.g.s.GfacServerHandler experiment_id=SLM003-QEspresso-JS20_05520f07-54c6-4695-b164-a6e7a987777f, gateway_id=seagrid - KeeperErrorCode = ConnectionLoss for /experiments/SLM003-QEspresso-JS20_05520f07-54c6-4695-b164-a6e7a987777f/PROCESS_3eb71ab2-9c86-4e54-a529-9bbc51eecd5e
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /experiments/SLM003-QEspresso-JS20_05520f07-54c6-4695-b164-a6e7a987777f/PROCESS_3eb71ab2-9c86-4e54-a529-9bbc51eecd5e
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:778)
        at org.apache.curator.utils.ZKPaths.mkdirs(ZKPaths.java:232)
        at org.apache.curator.utils.ZKPaths.mkdirs(ZKPaths.java:164)
        at org.apache.airavata.gfac.server.GfacServerHandler.createProcessZKNode(GfacServerHandler.java:350)
        at org.apache.airavata.gfac.server.GfacServerHandler.access$500(GfacServerHandler.java:76)
        at org.apache.airavata.gfac.server.GfacServerHandler$ProcessLaunchMessageHandler.onMessage(GfacServerHandler.java:233)
        at org.apache.airavata.messaging.core.impl.ProcessConsumer.handleDelivery(ProcessConsumer.java:81)
        at com.rabbitmq.client.impl.ConsumerDispatcher$5.run(ConsumerDispatcher.java:144)
        at com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:99)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
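
The trace shows ZKPaths.mkdirs calling ZooKeeper.create directly from createProcessZKNode, so a single transient ConnectionLoss surfaces as an error in the message handler. One possible mitigation, offered here only as a sketch and not as the fix adopted for this issue, is to retry the node creation with backoff. The class ZkRetrySketch, the method retryOnConnectionLoss, and the TransientConnectionLoss exception below are all hypothetical names; TransientConnectionLoss stands in for KeeperException.ConnectionLossException so the example compiles with the JDK alone. The sketch also assumes the wrapped action is safe to repeat.

```java
import java.util.concurrent.Callable;

// Hypothetical helper for illustration; not part of Airavata, Curator, or ZooKeeper.
final class ZkRetrySketch {

    // Stand-in for org.apache.zookeeper.KeeperException.ConnectionLossException,
    // used so this sketch has no dependencies beyond the JDK.
    static final class TransientConnectionLoss extends Exception {
        TransientConnectionLoss(String message) {
            super(message);
        }
    }

    // Runs the action up to maxAttempts times. After each failed attempt except
    // the last, sleeps baseSleepMs * 2^(attempt - 1) milliseconds. If every
    // attempt fails, rethrows the last TransientConnectionLoss.
    static <T> T retryOnConnectionLoss(Callable<T> action, int maxAttempts, long baseSleepMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransientConnectionLoss last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (TransientConnectionLoss e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(baseSleepMs << (attempt - 1));
                }
            }
        }
        throw last;
    }
}
```

In the handler, the action passed in would be the call that creates the process node. If all attempts fail, the caller would still need to decide what to do with the experiment (for example, mark it failed or requeue the launch message) rather than leave it in EXECUTING.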



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
