helix-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HELIX-636) JobQueue capacity is full due to job not cleaned up
Date Thu, 27 Oct 2016 23:17:58 GMT

    [ https://issues.apache.org/jira/browse/HELIX-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613616#comment-15613616 ]

ASF GitHub Bot commented on HELIX-636:
--------------------------------------

Github user lei-xia commented on a diff in the pull request:

    https://github.com/apache/helix/pull/55#discussion_r85445965
  
    --- Diff: helix-core/src/main/java/org/apache/helix/task/WorkflowRebalancer.java ---
    @@ -496,11 +496,10 @@ private void cleanupJob(final String job, String workflow) {
     
         // Delete job context
         // For recurring workflow, it's OK if the node doesn't exist.
    -    String propStoreKey = TaskUtil.getWorkflowContextKey(job);
    -    if (!_manager.getHelixPropertyStore().remove(propStoreKey, AccessOption.PERSISTENT)) {
    +    if (!TaskUtil.removeJobContext(_manager, job)) {
           LOG.warn(String.format(
               "Error occurred while trying to clean up job %s. Failed to remove node %s from Helix.",
    -          job, propStoreKey));
    +          job, TaskUtil.getWorkflowContextKey(job)));
    --- End diff --
    
    There is no need to call TaskUtil.getWorkflowContextKey(job) here. Just log something like: "Error ... when cleaning up the job's workflow context..."
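
    A minimal sketch of what the suggested change could look like, with the
    property store key dropped from the warning. Everything here is a stand-in
    for illustration: the in-memory JOB_CONTEXTS set and the one-argument
    removeJobContext stub are hypothetical (the real call in the diff is
    TaskUtil.removeJobContext(_manager, job)), the message wording is only one
    possible phrasing, and java.util.logging is used just to keep the example
    self-contained.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Logger;

public class CleanupJobSketch {
  private static final Logger LOG = Logger.getLogger(CleanupJobSketch.class.getName());

  // Hypothetical stand-in for the property store: names of jobs that have a context node.
  static final Set<String> JOB_CONTEXTS = new HashSet<>();

  // Hypothetical stub replacing TaskUtil.removeJobContext(_manager, job).
  // Returns true only if a context node existed and was removed.
  static boolean removeJobContext(String job) {
    return JOB_CONTEXTS.remove(job);
  }

  // Builds the warning from the job name alone, without recomputing the store key.
  static String cleanupWarning(String job) {
    return String.format(
        "Error occurred while trying to clean up job %s. Failed to remove the job context from Helix.",
        job);
  }

  // Mirrors the shape of the changed lines in cleanupJob: try to remove, warn on failure.
  static boolean cleanupJob(String job) {
    if (!removeJobContext(job)) {
      LOG.warning(cleanupWarning(job));
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    JOB_CONTEXTS.add("queue_job1");
    System.out.println(cleanupJob("queue_job1")); // context exists, so removal succeeds
    System.out.println(cleanupJob("queue_job1")); // already removed, so a warning is logged
  }
}
```

    The job name alone is usually enough to locate the node, which is why the
    key can be left out of the message.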


> JobQueue capacity is full due to job not cleaned up
> ---------------------------------------------------
>
>                 Key: HELIX-636
>                 URL: https://issues.apache.org/jira/browse/HELIX-636
>             Project: Apache Helix
>          Issue Type: Bug
>          Components: helix-core
>    Affects Versions: 0.6.4
>            Reporter: Junkai Xue
>            Assignee: Junkai Xue
>             Fix For: 0.6.x
>
>
> Since JobQueue never cleans up jobs that have reached a final state (FAILED, COMPLETED, ABORTED), the queue reaches its capacity limit as jobs keep being added.
> Fix: add an API in TaskDriver to clean up jobs that are in a final state.
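
The problem described above can be illustrated with a small, self-contained sketch. This is not Helix code: the BoundedJobQueueSketch class, its methods, and the IN_PROGRESS state are hypothetical, and only the three final states (FAILED, COMPLETED, ABORTED) come from the issue text. It shows a capacity-bounded queue that rejects new jobs until finished ones are purged.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedJobQueueSketch {
  // IN_PROGRESS is an assumed non-final state; the other three are the final states from the issue.
  enum JobState { IN_PROGRESS, FAILED, COMPLETED, ABORTED }

  private final int capacity;
  private final Map<String, JobState> jobs = new LinkedHashMap<>();

  BoundedJobQueueSketch(int capacity) {
    this.capacity = capacity;
  }

  // Rejects the job when the queue is full, even if every queued job has already finished.
  boolean enqueue(String job) {
    if (jobs.size() >= capacity) {
      return false;
    }
    jobs.put(job, JobState.IN_PROGRESS);
    return true;
  }

  void setState(String job, JobState state) {
    if (jobs.containsKey(job)) {
      jobs.put(job, state);
    }
  }

  // Removes every job in a final state and returns how many were removed.
  int cleanupFinishedJobs() {
    int removed = 0;
    Iterator<Map.Entry<String, JobState>> it = jobs.entrySet().iterator();
    while (it.hasNext()) {
      JobState state = it.next().getValue();
      if (state == JobState.FAILED || state == JobState.COMPLETED || state == JobState.ABORTED) {
        it.remove();
        removed++;
      }
    }
    return removed;
  }

  int size() {
    return jobs.size();
  }

  public static void main(String[] args) {
    BoundedJobQueueSketch queue = new BoundedJobQueueSketch(2);
    queue.enqueue("a");
    queue.enqueue("b");
    queue.setState("a", JobState.COMPLETED);
    System.out.println(queue.enqueue("c"));          // false: finished job "a" still occupies a slot
    System.out.println(queue.cleanupFinishedJobs()); // 1
    System.out.println(queue.enqueue("c"));          // true: a slot was freed
  }
}
```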



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
