tez-dev mailing list archives

From "Prabhu Joseph (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TEZ-3756) Tez Query fails because of a weed node and all four attempts are placed on same node
Date Fri, 09 Jun 2017 11:25:18 GMT
Prabhu Joseph created TEZ-3756:
----------------------------------

             Summary: Tez Query fails because of a weed node and all four attempts are placed on same node
                 Key: TEZ-3756
                 URL: https://issues.apache.org/jira/browse/TEZ-3756
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.7.1
            Reporter: Prabhu Joseph


A Tez query fails because one task fails on all four attempts with "Error: Could not find or
load main class org.apache.tez.runtime.task.TezChild". There is one bad node on which every
container fails with this error. The cached Tez library tez.tar.gz is corrupt on that machine.
The concern is that all four attempts of the task are placed on the same problematic node.

{code}
HW12691:TEZ pjoseph$ cat application_1495721159191_10342.log | grep attempt_1495721159191_10342_6_00_001808 | grep "Assigning container"

Assigning container to task: containerId=container_1495721159191_10342_01_000395, task=attempt_1495721159191_10342_6_00_001808_0

Assigning container to task: containerId=container_1495721159191_10342_01_000397, task=attempt_1495721159191_10342_6_00_001808_1

Assigning container to task: containerId=container_1495721159191_10342_01_000399, task=attempt_1495721159191_10342_6_00_001808_2

Assigning container to task: containerId=container_1495721159191_10342_01_000401, task=attempt_1495721159191_10342_6_00_001808_3

All four containers are placed on the same NodeManager:

Container: container_1495721159191_10342_01_000395 on xxx_45454
Container: container_1495721159191_10342_01_000397 on xxx_45454
Container: container_1495721159191_10342_01_000399 on xxx_45454
Container: container_1495721159191_10342_01_000401 on xxx_45454

{code}
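
The manual grep above can be automated. The following is only a rough sketch, assuming the aggregated application log contains lines in exactly the two formats shown above ("Assigning container to task: ..." and "Container: ... on ..."). The function names and the grouping key (the attempt ID with its trailing attempt number removed) are made up for illustration and are not part of Tez.

```python
import re
from collections import defaultdict

# Patterns for the two log line formats quoted in this report (assumed stable).
ASSIGN_RE = re.compile(
    r"Assigning container to task: containerId=(container_[^,\s]+), task=(attempt_\S+)"
)
HOST_RE = re.compile(r"Container: (container_\S+) on (\S+)")


def nodes_per_task(log_lines):
    """Map each task key to a list of (attempt_id, node) pairs.

    The task key is the attempt ID without its trailing attempt number.
    The node is None when no "Container: ... on ..." line was seen.
    """
    container_to_attempt = {}
    container_to_node = {}
    for line in log_lines:
        match = ASSIGN_RE.search(line)
        if match:
            container_to_attempt[match.group(1)] = match.group(2)
            continue
        match = HOST_RE.search(line)
        if match:
            container_to_node[match.group(1)] = match.group(2)

    grouped = defaultdict(list)
    for container, attempt in container_to_attempt.items():
        task_key = attempt.rsplit("_", 1)[0]
        grouped[task_key].append((attempt, container_to_node.get(container)))
    return dict(grouped)


def tasks_stuck_on_one_node(log_lines, min_attempts=2):
    """Return {task_key: node} for tasks whose attempts all ran on one node."""
    stuck = {}
    for task_key, attempts in nodes_per_task(log_lines).items():
        nodes = {node for _, node in attempts}
        if len(attempts) >= min_attempts and len(nodes) == 1 and None not in nodes:
            stuck[task_key] = nodes.pop()
    return stuck
```

Passing the lines of application_1495721159191_10342.log to tasks_stuck_on_one_node would list every task whose attempts never left a single node, not only the one task inspected by hand here.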

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
