hadoop-yarn-dev mailing list archives

From "Hong Zhiguo (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-4104) dryrun of schedule for diagnostic and tenant's complain
Date Wed, 02 Sep 2015 01:41:45 GMT
Hong Zhiguo created YARN-4104:
---------------------------------

             Summary: dryrun of schedule for diagnostic and tenant's complain
                 Key: YARN-4104
                 URL: https://issues.apache.org/jira/browse/YARN-4104
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: scheduler
            Reporter: Hong Zhiguo
            Assignee: Hong Zhiguo
            Priority: Minor


We have more than a thousand queues and several hundred tenants in a busy cluster. We
get a lot of complaints and questions from queue owners and operators asking, "Why can't
my queue/app get resources for such a long time?"

It's really hard to answer such questions.

So we added a diagnostic REST endpoint, "/ws/v1/cluster/schedule/dryrun/{parentQueueName}",
which returns the children of the given parent queue, sorted according to its
SchedulingPolicy.getComparator(). All scheduling parameters of the children are also
displayed, such as minShare, usage, demand, weight, and priority.
Usually we just call "/ws/v1/cluster/schedule/root", and the result answers the question by
itself. I find it really useful for multi-tenant clusters, and hope it can be merged into the
mainline.
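
To illustrate the idea, here is a minimal Python sketch of such a dry run: it sorts the children of one parent queue with a comparator and reports each child's rank next to its scheduling parameters. It is not the patch itself. The ChildQueue class, the compare and dry_run functions, the single memory dimension, and the ordering rule (children below their min share first, then by usage-to-weight ratio) are all simplifying assumptions, loosely modeled on fair-share ordering.

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass
class ChildQueue:
    # Hypothetical stand-in for a child queue; one resource dimension (MB).
    name: str
    min_share: int
    usage: int
    demand: int
    weight: float  # assumed to be > 0


def compare(a, b):
    """Assumed ordering rule: a negative result means `a` is served first."""
    # A queue cannot be owed more than it actually asks for.
    a_min = min(a.min_share, a.demand)
    b_min = min(b.min_share, b.demand)
    a_needy = a.usage < a_min
    b_needy = b.usage < b_min
    # Queues below their min share go ahead of queues at or above it.
    if a_needy != b_needy:
        return -1 if a_needy else 1
    if a_needy:
        # Both below min share: the one furthest below (by ratio) goes first.
        key_a = a.usage / max(a_min, 1)
        key_b = b.usage / max(b_min, 1)
    else:
        # Both satisfied: the one using least relative to its weight goes first.
        key_a = a.usage / a.weight
        key_b = b.usage / b.weight
    if key_a != key_b:
        return -1 if key_a < key_b else 1
    # Tie-break on name so the output is deterministic.
    return (a.name > b.name) - (a.name < b.name)


def dry_run(children):
    """Return the children in scheduling order, with their parameters."""
    ordered = sorted(children, key=cmp_to_key(compare))
    return [
        {
            "rank": rank,
            "name": q.name,
            "minShare": q.min_share,
            "usage": q.usage,
            "demand": q.demand,
            "weight": q.weight,
        }
        for rank, q in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    queues = [
        ChildQueue("a", min_share=1024, usage=4096, demand=8192, weight=1.0),
        ChildQueue("b", min_share=2048, usage=512, demand=4096, weight=1.0),
        ChildQueue("c", min_share=0, usage=1024, demand=2048, weight=2.0),
    ]
    for row in dry_run(queues):
        print(row)
```

With the sample data, queue "b" ranks first because it is below its min share, and "a" ranks last because it uses the most relative to its weight. That is the kind of answer a tenant asking about a starved queue needs to see.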



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
