hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
Date Wed, 11 May 2016 21:32:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280865#comment-15280865 ]

Chris Nauroth commented on HADOOP-13028:

I'm in favor of including the stream statistics in {{S3AInputStream#toString}}.  This is an
extension of the stream state already provided.  I would like us to have the ability to evolve
{{toString}} output for improved diagnostics like this.

Typical Java best practices advise using {{toString}} output as a debugging aid, not as a
stable format suitable for UI display or object serialization.  HDFS-9732 is an example of
a patch where I have advised against using {{toString}} as a serialization format and recommended
migrating to a different method that can provide a stability guarantee.  In the future, I
will strongly consider -1'ing patches that introduce these kinds of dependencies on {{toString}}.
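
As a rough illustration of that distinction, here is a minimal sketch. {{StreamStats}}, {{toStableString}} and the counter names are hypothetical and are not taken from S3A or from the HADOOP-13028 patches. The idea is that {{toString}} stays free to evolve for diagnostics, while a separate method carries whatever format guarantee callers need.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical statistics holder, for illustration only (not the S3A class).
final class StreamStats {
    private final AtomicLong openOperations = new AtomicLong();
    private final AtomicLong closeOperations = new AtomicLong();

    void recordOpen() { openOperations.incrementAndGet(); }
    void recordClose() { closeOperations.incrementAndGet(); }

    long getOpenOperations() { return openOperations.get(); }
    long getCloseOperations() { return closeOperations.get(); }

    // Assumed "stable" format: documented, so callers may parse or persist it.
    String toStableString() {
        return "open=" + getOpenOperations() + ";close=" + getCloseOperations();
    }

    // Debugging aid only: wording and fields may change between releases.
    @Override
    public String toString() {
        return "StreamStats{opened " + getOpenOperations()
            + " time(s), closed " + getCloseOperations() + " time(s)}";
    }
}
```

Code that needs to parse the counters would call {{toStableString}} (or the getters), which leaves {{toString}} open to gain new fields without breaking anyone.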

While reflection-based approaches are viable, especially with some helpful libraries, I've
never heard contributors to projects that use them say that they like writing their code that
way.  Instead, I tend to hear that it makes their code more awkward or introduces potential
performance risks from the extra indirection.
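
To make the indirection concrete, here is a small sketch of what a reflection-based caller might look like. {{ReflectiveStatsReader}}, {{SampleStream}} and {{getBytesRead}} are hypothetical names invented for this example. They do not correspond to any real S3A API.

```java
import java.lang.reflect.Method;

// Hypothetical stream exposing one counter through an ordinary getter.
final class SampleStream {
    public long getBytesRead() { return 4096L; }
}

// Hypothetical helper: looks the getter up by name at run time.
final class ReflectiveStatsReader {
    static long readCounter(Object stream, String getterName)
            throws ReflectiveOperationException {
        // A misspelled name only fails here, at run time, not at compile time.
        Method getter = stream.getClass().getMethod(getterName);
        Object value = getter.invoke(stream);
        // The caller must also know (or guess) the return type.
        return ((Number) value).longValue();
    }
}
```

Compared with calling {{stream.getBytesRead()}} directly, the reflective version adds a name lookup, checked exceptions and a cast on every use.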

Another consideration is integration with logging.  SLF4J makes it easy to pass along template
arguments, and SLF4J then calls {{toString}} lazily, only when the configured logging level
requires it.  If the output is hidden behind a different method, or is accessible only through
reflection, then applications will have to go back to coding their own conditional checks on the
log level to avoid potentially costly method calls.
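
The lazy behaviour can be sketched without pulling in a logging dependency. {{TemplateLogger}} below is a hypothetical stand-in that only mimics the idea of parameterized logging. It is not the SLF4J API, and {{DiagnosticStream}} is likewise invented for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a parameterized logger; NOT the SLF4J API.
final class TemplateLogger {
    private final boolean debugEnabled;
    private final StringBuilder out = new StringBuilder();

    TemplateLogger(boolean debugEnabled) { this.debugEnabled = debugEnabled; }

    // The argument is turned into a string only if the level is enabled.
    void debug(String template, Object arg) {
        if (!debugEnabled) {
            return; // arg.toString() is never invoked on this path
        }
        out.append(template.replace("{}", String.valueOf(arg))).append('\n');
    }

    String output() { return out.toString(); }
}

// Hypothetical stream that counts how often its toString() is built.
final class DiagnosticStream {
    final AtomicInteger toStringCalls = new AtomicInteger();

    @Override
    public String toString() {
        toStringCalls.incrementAndGet();
        return "DiagnosticStream{pos=0, opens=1}";
    }
}
```

With the statistics in {{toString}}, a call such as {{logger.debug("stream state: {}", stream)}} costs nothing when debug is off. If they lived behind another method, the caller would need an explicit level check around the call to get the same effect.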

> add low level counter metrics for S3A; use in read performance tests
> --------------------------------------------------------------------
>                 Key: HADOOP-13028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13028
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3, metrics
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, HADOOP-13028-004.patch,
> HADOOP-13028-005.patch, HADOOP-13028-006.patch, HADOOP-13028-007.patch, HADOOP-13028-008.patch,
> HADOOP-13028-009.patch, HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch,
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt,
>
> against S3 (and other object stores), opening connections can be expensive, closing connections
> may be expensive (a sign of a regression).
> S3A FS and individual input streams should have counters of the # of open/close/failure+reconnect
> operations, timers of how long things take. This can be used downstream to measure efficiency
> of the code (how often connections are being made), connection reliability, etc.

