hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10079) log a warning message if group resolution takes too long.
Date Fri, 01 Nov 2013 20:55:17 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811649#comment-13811649 ]

Colin Patrick McCabe commented on HADOOP-10079:

bq. Can you comment on how you chose the default value of 30s?

I was basing this timeout on {{ha.health-monitor.rpc-timeout.ms}}, which is currently 45000
(45 seconds).  I'll change it to 5 seconds in the interest of detecting more performance problems.

bq. Do we need that new startMs variable when we already have now (which could be renamed
startMs)? I don't think the cached check contributes meaningfully compared to the 30s timeout.

True, we only need to call {{Time#monotonicNow}} once.
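
For illustration only, here is a rough sketch of the idea: one monotonic timestamp taken at the start of the lookup serves both the cache-expiry check and the elapsed-time measurement. This is not the patch itself. The class {{TimedGroupLookup}}, its constructor parameters, and the warning text are hypothetical, the clock is an injected {{LongSupplier}} standing in for {{Time#monotonicNow}}, and warnings are collected in a list instead of going to a logger so that the example stays self-contained. It is also not thread-safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch, not the real org.apache.hadoop.security.Groups class.
class TimedGroupLookup {
    // Cached result plus the monotonic time at which it was fetched.
    private static final class Entry {
        final List<String> groups;
        final long fetchedAtMs;

        Entry(List<String> groups, long fetchedAtMs) {
            this.groups = groups;
            this.fetchedAtMs = fetchedAtMs;
        }
    }

    private final Function<String, List<String>> resolver; // slow backend lookup
    private final LongSupplier clockMs;                    // monotonic clock, in ms
    private final long cacheTimeoutMs;
    private final long warnThresholdMs;
    private final Map<String, Entry> cache = new HashMap<>();
    private final List<String> warnings = new ArrayList<>();

    TimedGroupLookup(Function<String, List<String>> resolver, LongSupplier clockMs,
                     long cacheTimeoutMs, long warnThresholdMs) {
        this.resolver = resolver;
        this.clockMs = clockMs;
        this.cacheTimeoutMs = cacheTimeoutMs;
        this.warnThresholdMs = warnThresholdMs;
    }

    List<String> getGroups(String user) {
        // Single clock read: reused for the cache check and as the timing start.
        long startMs = clockMs.getAsLong();
        Entry cached = cache.get(user);
        if (cached != null && startMs - cached.fetchedAtMs < cacheTimeoutMs) {
            return cached.groups;
        }
        List<String> groups = resolver.apply(user);
        long endMs = clockMs.getAsLong();
        long elapsedMs = endMs - startMs;
        if (elapsedMs >= warnThresholdMs) {
            // A real implementation would call LOG.warn here.
            warnings.add("Potential performance problem: getGroups(user=" + user
                + ") took " + elapsedMs + " ms");
        }
        cache.put(user, new Entry(groups, endMs));
        return groups;
    }

    List<String> warnings() {
        return warnings;
    }

    // A monotonic millisecond clock built on System.nanoTime().
    static LongSupplier monotonicClockMs() {
        return () -> System.nanoTime() / 1_000_000L;
    }
}
```

With a 5-second threshold, a call such as {{new TimedGroupLookup(resolver, TimedGroupLookup.monotonicClockMs(), 300000L, 5000L)}} would record a warning only when the backend lookup itself is slow. Cache hits return before the second clock read.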

bq. Also, while we're in this file, how about moving the hardcoded default value of HADOOP_SECURITY_GROUPS_CACHE_SECS
into CommonConfigurationKeys?


bq. Since we're now measuring group resolution time, should we perhaps put this into a percentile
metric as well?

I took a look at doing this, but it got complex.  For one thing, {{dfs.metrics.percentiles.intervals}}
is currently HDFS-only, not in common.  For another, the way the {{Groups}} singleton is created
doesn't take into account the process name, which we would need in order to add the right metrics.
Let's open a follow-up JIRA for this.

> log a warning message if group resolution takes too long.
> ---------------------------------------------------------
>                 Key: HADOOP-10079
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10079
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 2.2.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HADOOP-10079.001.patch, HADOOP-10079.002.patch
> We should log a warning message if group resolution takes too long.

This message was sent by Atlassian JIRA
