airavata-issues mailing list archives

From "Dimuthu Upeksha (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRAVATA-2742) Helix Controller throws an Exception when the participant is killed
Date Mon, 09 Apr 2018 14:38:00 GMT
Dimuthu Upeksha created AIRAVATA-2742:
-----------------------------------------

             Summary: Helix Controller throws an Exception when the participant is killed
                 Key: AIRAVATA-2742
                 URL: https://issues.apache.org/jira/browse/AIRAVATA-2742
             Project: Airavata
          Issue Type: Bug
          Components: helix implementation
    Affects Versions: 0.18
            Reporter: Dimuthu Upeksha


This was a sporadic issue that occurred only once in the test setup. There were 5 - 10 tasks
running in the Participant when the Participant was killed externally with SIGTERM (kill
<process-id>). Once the Participant was started again, it did not pick up the tasks that
it had been running at the time it was killed. Surprisingly, the status of the respective workflows
was IN_PROGRESS. The Helix Controller log showed the following error for each Workflow. This
seems like a bug in Helix, and I posted the issue to the Helix mailing list (Subject: Sporadic
issue when restarting a Participant).

 
2018-04-06 15:10:57,766 [Thread-3] ERROR o.a.h.c.s.BestPossibleStateCalcStage  - Error computing assignment for resource Workflow_of_process_PROCESS_7f6c8a54-b50f-4bdb-aafd-59ce87276527-POST-b5e39e07-2d8e-4309-be5a-f5b6067f9a24_TASK_cc8039e5-f054-4dea-8c7f-07c98077b117. Skipping.
java.lang.NullPointerException: Name is null
        at java.lang.Enum.valueOf(Enum.java:236)
        at org.apache.helix.task.TaskPartitionState.valueOf(TaskPartitionState.java:25)
        at org.apache.helix.task.JobRebalancer.computeResourceMapping(JobRebalancer.java:272)
        at org.apache.helix.task.JobRebalancer.computeBestPossiblePartitionState(JobRebalancer.java:140)
        at org.apache.helix.controller.stages.BestPossibleStateCalcStage.compute(BestPossibleStateCalcStage.java:171)
        at org.apache.helix.controller.stages.BestPossibleStateCalcStage.process(BestPossibleStateCalcStage.java:66)
        at org.apache.helix.controller.pipeline.Pipeline.handle(Pipeline.java:48)
        at org.apache.helix.controller.GenericHelixController.handleEvent(GenericHelixController.java:295)
        at org.apache.helix.controller.GenericHelixController$ClusterEventProcessor.run(GenericHelixController.java:595)
2018-04-06 15:11:00,385 [Thread-3] ERROR o.a.h.c.s.BestPossibleStateCalcStage  - Error computing assignment for resource Workflow_of_process_PROCESS_2b69b499-c527-4c9d-8b2b-db17366f5f81-POST-c67607ae-9177-4a02-af8a-8b3751eea4ff_TASK_1ea6876d-f2ec-4139-a15d-0e64a80a3025. Skipping.
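
The top frames of the trace (Enum.valueOf called from TaskPartitionState.valueOf) suggest that a null state name reached the enum lookup. The following is only a minimal sketch of that failure mode, not the Helix code: PartitionState is a hypothetical stand-in for org.apache.helix.task.TaskPartitionState, and the assumption that the name was null because the killed Participant left no state recorded for the partition is not confirmed by the log.

```java
public class EnumValueOfNullDemo {

    // Hypothetical stand-in for org.apache.helix.task.TaskPartitionState;
    // the constants here are illustrative only.
    enum PartitionState { INIT, RUNNING, COMPLETED }

    // Looks up the enum constant by name and reports what happened.
    // A null name makes Enum.valueOf throw NullPointerException,
    // which is the exception type seen in the controller log.
    static String describe(String name) {
        try {
            return PartitionState.valueOf(name).name();
        } catch (NullPointerException e) {
            return "NPE: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Normal case: a recorded state name resolves to its constant.
        System.out.println(describe("RUNNING"));
        // Assumed failure case: no state name was available for the partition.
        System.out.println(describe(null));
    }
}
```

If this is what happened, a null check before the valueOf call would avoid the exception, but whether a missing state should be skipped or treated as a restartable task is a question for the Helix developers.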
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
