tez-dev mailing list archives

From "Kuhu Shukla (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TEZ-4027) DagAwareYarnTaskScheduler can miscompute blocked vertices and cause a hang
Date Fri, 14 Dec 2018 21:38:00 GMT
Kuhu Shukla created TEZ-4027:
--------------------------------

             Summary: DagAwareYarnTaskScheduler can miscompute blocked vertices and cause a hang
                 Key: TEZ-4027
                 URL: https://issues.apache.org/jira/browse/TEZ-4027
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.9.1, 0.10.0
            Reporter: Kuhu Shukla
            Assignee: Kuhu Shukla


When retroactive failures occur and the YARN queue is too full to allow any new container
assignments, the scheduler can miscompute the set of blocked vertices: it flips bits only up
to the length of the bitset, which may not reflect the total number of vertices. As a result,
no preemption happens and the DAG hangs.

{code}
    @GuardedBy("DagAwareYarnTaskScheduler.this")
    BitSet createVertexBlockedSet() {
      BitSet blocked = new BitSet();
      Entry<Priority, RequestPriorityStats> entry = priorityStats.lastEntry();
      if (entry != null) {
        RequestPriorityStats stats = entry.getValue();
        blocked.or(stats.allowedVertices);
        blocked.flip(0, blocked.length());
        blocked.or(stats.descendants);
      }
      return blocked;
    }
{code}
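
The effect can be reproduced outside Tez with plain java.util.BitSet, since BitSet.length()
returns the index of the highest set bit plus one rather than any fixed capacity. The sketch
below is illustrative only: the class name, both helper methods and the numVertices parameter
are hypothetical, and flipping up to the vertex count is an assumed remedy, not necessarily
the change that will be committed for this issue.

```java
import java.util.BitSet;

public class BlockedSetSketch {
    // Mirrors the logic quoted above: the flip stops at the highest allowed vertex.
    static BitSet blockedByLength(BitSet allowed, BitSet descendants) {
        BitSet blocked = new BitSet();
        blocked.or(allowed);
        blocked.flip(0, blocked.length());
        blocked.or(descendants);
        return blocked;
    }

    // Hypothetical alternative: flip across every vertex index in the DAG.
    static BitSet blockedByVertexCount(BitSet allowed, BitSet descendants, int numVertices) {
        BitSet blocked = new BitSet(numVertices);
        blocked.or(allowed);
        blocked.flip(0, numVertices);
        blocked.or(descendants);
        return blocked;
    }

    public static void main(String[] args) {
        int numVertices = 5;            // assumed DAG size for the example
        BitSet allowed = new BitSet(numVertices);
        allowed.set(1);                 // only vertex 1 is allowed
        BitSet descendants = new BitSet(numVertices);

        // Vertices 2, 3 and 4 are never marked blocked by the length-based flip.
        System.out.println(blockedByLength(allowed, descendants));                   // {0}
        System.out.println(blockedByVertexCount(allowed, descendants, numVertices)); // {0, 2, 3, 4}
    }
}
```

With only vertex 1 allowed in a five-vertex DAG, the length-based flip reports just vertex 0
as blocked, so vertices 2 through 4 are left out of the blocked set.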



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
