hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13065) Add a new interface for retrieving FS and FC Statistics
Date Tue, 10 May 2016 20:34:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278833#comment-15278833 ]

Colin Patrick McCabe commented on HADOOP-13065:

Thanks, [~liuml07].  {{DFSOpsCountStatistics}} is a nice implementation.  It's nice to
have this for webhdfs as well.

  @Override
  public Long getLong(String key) {
    final OpType type = OpType.fromSymbol(key);
    return type == null ? 0L : opsCount.get(type).get();
  }
I think this should return null in the case where type == null, right?  Indicating that there
is no such statistic.
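
To make the suggestion concrete, here is a minimal, self-contained sketch of the null-returning variant. It is not the patch's code: {{OpsCountSketch}}, the two-entry {{OpType}} enum, the symbol strings, and the {{increment}} helper are simplified stand-ins assumed for illustration only.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Simplified stand-in for the statistics class under review; not the real Hadoop code. */
class OpsCountSketch {
  /** Hypothetical subset of operation types, each identified by a symbol string. */
  enum OpType {
    CREATE("op_create"),
    MKDIRS("op_mkdirs");

    private final String symbol;

    OpType(String symbol) {
      this.symbol = symbol;
    }

    /** Returns the matching type, or null if the symbol is unknown. */
    static OpType fromSymbol(String symbol) {
      for (OpType t : values()) {
        if (t.symbol.equals(symbol)) {
          return t;
        }
      }
      return null;
    }
  }

  /** One counter per operation type, all starting at zero. */
  private final Map<OpType, AtomicLong> opsCount = new EnumMap<>(OpType.class);

  OpsCountSketch() {
    for (OpType t : OpType.values()) {
      opsCount.put(t, new AtomicLong(0));
    }
  }

  /** Hypothetical helper that records one occurrence of an operation. */
  void increment(OpType type) {
    opsCount.get(type).incrementAndGet();
  }

  /** Returns the counter value, or null when no statistic has this key. */
  Long getLong(String key) {
    final OpType type = OpType.fromSymbol(key);
    return type == null ? null : Long.valueOf(opsCount.get(type).get());
  }
}
```

With this shape, a caller can tell "the statistic exists and is zero" (a {{Long}} of 0) apart from "there is no such statistic" (null), which returning 0L for both cases does not allow.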

    storageStatistics = (DFSOpsCountStatistics) GlobalStorageStatistics.INSTANCE
        .put(DFSOpsCountStatistics.NAME,
          new StorageStatisticsProvider() {
            @Override
            public StorageStatistics provide() {
              return new DFSOpsCountStatistics();
            }
          });
Hmm, I wonder if these StorageStatistics objects should be per-FS-instance rather than per-class?
 I guess let's do that in a follow-on, though, after this gets committed.
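
For context on the per-class question, here is a toy sketch of a name-keyed registry. It assumes that {{put}} returns the already-registered object when the name exists, which is my reading of the snippet above rather than something verified here. {{RegistrySketch}} and {{Stats}} are hypothetical names, and the per-instance key shown in the usage note is only one possible approach.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Toy registry; assumes put() returns the existing entry for a name if there is one. */
class RegistrySketch {
  /** Placeholder statistics object that carries only its name. */
  static final class Stats {
    final String name;

    Stats(String name) {
      this.name = name;
    }
  }

  private final Map<String, Stats> byName = new HashMap<>();

  /** Creates the entry on first use; otherwise returns the one already registered. */
  synchronized Stats put(String name, Supplier<Stats> provider) {
    Stats existing = byName.get(name);
    if (existing != null) {
      return existing;
    }
    Stats created = provider.get();
    byName.put(name, created);
    return created;
  }
}
```

Under that assumption, two file system instances that both register under the same fixed name end up sharing one counter object, so their operations are summed together. Registering under a name that includes some instance identifier (for example a hypothetical "DFSOpsCountStatistics-fs1") would give each instance its own object.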

+1 once the null thing is fixed

> Add a new interface for retrieving FS and FC Statistics
> -------------------------------------------------------
>                 Key: HADOOP-13065
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13065
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Ram Venkatesh
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-13065-007.patch, HADOOP-13065.008.patch, HADOOP-13065.009.patch,
> HADOOP-13065.010.patch, HADOOP-13065.011.patch, HADOOP-13065.012.patch, HDFS-10175.000.patch,
> HDFS-10175.001.patch, HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, HDFS-10175.005.patch,
> HDFS-10175.006.patch, TestStatisticsOverhead.java
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks. The logic
> within DfsClient that maps operations to these counters can be confusing; for instance,
> mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation, including create, append, createSymlink,
> delete, exists, mkdirs, and rename, and expose them as new properties on the Statistics object.
> The operation-specific counters can be used for analyzing the load imposed by a particular
> job on HDFS.
> For example, we can use them to identify jobs that end up creating a large number of
> Once this information is available in the Statistics object, application frameworks such as
> MapReduce can expose them as additional counters to be aggregated and recorded as part of
> the job summary.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
