hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13065) Add a new interface for retrieving FS and FC Statistics
Date Wed, 04 May 2016 11:10:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270485#comment-15270485 ]

Steve Loughran commented on HADOOP-13065:

AtomicLong has contention under load; it doesn't compile down to a simple x86 {{LOCK XADD $address,
value}}, which is all you need. The updater stuff is designed to get closer to this, with the
eventually generated code being something like

address = &object + offset-to-field
LOCK XADD $address, value

(no, I can't code x86 ASM. I just spent lots of time staring at it while trying to debug Windows
C++ code)
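
For illustration, here is a minimal sketch of the "updater" approach, assuming it refers to {{java.util.concurrent.atomic.AtomicLongFieldUpdater}}. The class and field names are hypothetical and not taken from the Hadoop patch. Whether the JIT actually reduces the update to a single locked instruction depends on the JVM and platform.

```java
import java.util.concurrent.atomic.AtomicLongFieldUpdater;

// Hypothetical counter holder; names are illustrative, not Hadoop's.
class BytesReadCounter {
    // The updated field must be a non-static volatile long.
    private volatile long bytesRead;

    // One static updater shared by every instance, so there is no
    // separate AtomicLong object allocated per counter.
    private static final AtomicLongFieldUpdater<BytesReadCounter> BYTES_READ =
        AtomicLongFieldUpdater.newUpdater(BytesReadCounter.class, "bytesRead");

    // Atomically adds delta to the field and returns the new value.
    long add(long delta) {
        return BYTES_READ.addAndGet(this, delta);
    }

    long get() {
        return bytesRead;
    }
}

class UpdaterSketch {
    public static void main(String[] args) throws InterruptedException {
        BytesReadCounter counter = new BytesReadCounter();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    counter.add(1);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(counter.get()); // prints 40000
    }
}
```

The updater addresses the field inside the owning object directly, which is the "object + offset-to-field" idea sketched above.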

The Java 9 stuff is actually intended to do something really profound: provide the same operations
against arrays, so {{address = array + (aligned) offset}}, like C's {{(*(array+offset))++}}, if the
array were of some atomic long[] type.
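
A rough sketch of what that looks like in practice, assuming the Java 9 feature in question is {{java.lang.invoke.VarHandle}} (JEP 193). The {{ArrayCounters}} class is hypothetical; it only shows an atomic add against an element of a plain {{long[]}}.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical holder for a block of counters stored in a plain long[].
class ArrayCounters {
    // A VarHandle that addresses individual elements of any long[].
    private static final VarHandle ELEMENT =
        MethodHandles.arrayElementVarHandle(long[].class);

    private final long[] counts;

    ArrayCounters(int size) {
        counts = new long[size];
    }

    // Atomically increments counts[index] and returns the new value.
    long increment(int index) {
        return (long) ELEMENT.getAndAdd(counts, index, 1L) + 1;
    }

    // Reads counts[index] with volatile semantics.
    long get(int index) {
        return (long) ELEMENT.getVolatile(counts, index);
    }
}

class ArrayCountersSketch {
    public static void main(String[] args) {
        ArrayCounters counters = new ArrayCounters(4);
        counters.increment(2);
        counters.increment(2);
        counters.increment(2);
        System.out.println(counters.get(2)); // prints 3
    }
}
```

This requires Java 9 or later; on earlier versions {{AtomicLongArray}} is the closest standard equivalent.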

The fencing stuff is just there to let people who think they understand memory models, CPU
and compiler re-ordering & the like write fast code. I've only ever seen anyone argue
for doing that in user level code once [Twitter: eventually consistent data structures|https://vimeo.com/43903960]
and their code only worked for a reason they didn't understand: in Java 5+, volatile reads
are non-reorderable fences on all accesses. (That is: he gets the explanation of why things
work wrong.) Nobody should be going near that in the Hadoop code at all.
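
To make the volatile point concrete, here is a small sketch (hypothetical class, not from any Hadoop code) of publishing a plain field through a volatile flag, with no explicit fences. Under the Java 5+ memory model, the write to {{payload}} happens-before the volatile write to {{ready}}, which in turn happens-before any read that sees {{ready == true}}.

```java
// Hypothetical example of safe publication via a volatile flag.
class VolatilePublish {
    private long payload;           // plain, non-volatile field
    private volatile boolean ready; // volatile flag guarding the payload

    // Writer: the plain write is ordered before the volatile write.
    void publish(long value) {
        payload = value;
        ready = true;
    }

    // Reader: once the volatile read sees true, the payload write is visible.
    long awaitAndRead() {
        while (!ready) {
            Thread.yield();
        }
        return payload;
    }
}

class VolatilePublishSketch {
    public static void main(String[] args) {
        VolatilePublish box = new VolatilePublish();
        new Thread(() -> box.publish(42L)).start();
        System.out.println(box.awaitAndRead()); // prints 42
    }
}
```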

> Add a new interface for retrieving FS and FC Statistics
> -------------------------------------------------------
>                 Key: HADOOP-13065
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13065
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Ram Venkatesh
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-13065-007.patch, HADOOP-13065.008.patch, HDFS-10175.000.patch,
HDFS-10175.001.patch, HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, HDFS-10175.005.patch,
HDFS-10175.006.patch, TestStatisticsOverhead.java
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks. There is
logic within DfsClient to map operations to these counters that can be confusing, for instance,
mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, createSymlink,
delete, exists, mkdirs, rename and expose them as new properties on the Statistics object.
The operation-specific counters can be used for analyzing the load imposed by a particular
job on HDFS. 
> For example, we can use them to identify jobs that end up creating a large number of
> Once this information is available in the Statistics object, the app frameworks like
MapReduce can expose them as additional counters to be aggregated and recorded as part of
job summary.

This message was sent by Atlassian JIRA
