tez-dev mailing list archives

From "Wang Yan (Jira)" <j...@apache.org>
Subject [jira] [Created] (TEZ-4110) Make Tez fail fast when DFS quota is exceeded
Date Mon, 23 Dec 2019 04:00:00 GMT
Wang Yan created TEZ-4110:
-----------------------------

             Summary: Make Tez fail fast when DFS quota is exceeded
                 Key: TEZ-4110
                 URL: https://issues.apache.org/jira/browse/TEZ-4110
             Project: Apache Tez
          Issue Type: Improvement
         Environment: hadoop 2.9, hive 2.3, tez
 
            Reporter: Wang Yan


This ticket aims to add a feature to Tez similar to MAPREDUCE-7148.

Make a Tez job fail fast when the DFS quota limit is reached.

Background: we run Hive jobs with a DFS quota limit per job (3 TB). If a job hits the DFS
quota limit, the task that hit it fails and is retried several times before the job actually
fails. The retries do not help, because the job will fail anyway. In the worst cases, a job
has a single reduce task that writes more than 3 TB to HDFS over 20 hours; the reduce task
exceeds the quota limit and retries 4 times before the job finally fails, consuming a lot of
resources unnecessarily. This ticket aims to let a job fail fast when it writes too much
data to the DFS and exceeds the DFS quota limit.
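
To make the request concrete, here is a rough sketch of the kind of check being asked for: walk the cause chain of a task failure and, if a quota violation is found, fail the job instead of retrying the task. This is only an illustration, not Tez code. The class QuotaFailFast, the FailureAction enum and the classify method are hypothetical names, and the nested DSQuotaExceededException is a stand-in so the example compiles without Hadoop on the classpath. It assumes the quota violation surfaces as an exception whose class name ends with "DSQuotaExceededException".

```java
import java.io.IOException;

public class QuotaFailFast {

    // Assumption: a quota violation shows up as an exception with this class-name suffix.
    public static final String QUOTA_EXCEPTION_SUFFIX = "DSQuotaExceededException";

    // Hypothetical outcome of classifying a task failure.
    public enum FailureAction { RETRY_TASK, FAIL_JOB }

    // Stand-in for the HDFS exception so the sketch has no Hadoop dependency.
    public static class DSQuotaExceededException extends IOException {
        public DSQuotaExceededException(String message) {
            super(message);
        }
    }

    // Walks the cause chain (bounded, in case of a cyclic chain) looking for a quota error.
    public static boolean isQuotaExceeded(Throwable failure) {
        int depth = 0;
        for (Throwable cur = failure; cur != null && depth < 32; cur = cur.getCause()) {
            if (cur.getClass().getName().endsWith(QUOTA_EXCEPTION_SUFFIX)) {
                return true;
            }
            depth++;
        }
        return false;
    }

    // Quota errors are treated as fatal; every other failure keeps the normal retry path.
    public static FailureAction classify(Throwable failure) {
        return isQuotaExceeded(failure) ? FailureAction.FAIL_JOB : FailureAction.RETRY_TASK;
    }

    public static void main(String[] args) {
        Throwable quota = new RuntimeException(
                "write failed", new DSQuotaExceededException("quota exceeded"));
        Throwable other = new IOException("connection reset");
        System.out.println(classify(quota));
        System.out.println(classify(other));
    }
}
```

Matching on the class name keeps the sketch self-contained; a real change would more likely reference the Hadoop exception type directly and put the behaviour behind a configuration switch.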

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
