hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
Date Mon, 02 May 2016 20:08:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267400#comment-15267400

Chris Nauroth commented on HADOOP-13028:

Hello [~stevel@apache.org].  I'm still digging into the changes in the seek code and the tests,
but I'd like to share the feedback I have so far for patch v008.

# Let's put visibility annotations on {{MetricsRecordBuilder}}.  Public/Evolving for agreement
with the {{MetricsRecordBuilder}} base class?
# Should {{initMultipartUploads}} increment the ignored error counter?
# {{rename}} has several {{catch}} blocks that don't propagate an exception.  Should these
increment the ignored error counter?
# In the following exception message, I think we need an extra space before the "to".  There
are 2 different call sites that produce this message, so 2 spots to fix.
      throw new InterruptedIOException("Interrupted copying " + src
          + "to "  + dst + ", cancelling");
# Can you please include {{src}} and {{dst}} in this log message from {{rename}}?
      LOG.debug("rename: src or dst are empty");
# Can you please include {{key}} in this log message from {{delete}}?
        LOG.debug("Deleting fake empty directory");
# Should {{S3AFileSystem#toString}} also include {{maxKeys}}, {{cannedACL}} and {{readAhead}}?
# In {{S3AInputStream#reopen}}, is the following log message redundant, considering the call
to {{closeStream}} will do its own logging?
      LOG.debug("Closing the previous stream");
      closeStream("reopen(" + reason + ")", requestedStreamLen);
# {{S3AInputStream#setReadahead}} doesn't exactly match the specification defined in {{CanSetReadahead}}.
 The interface says that {{null}} means to use the default, but the implementation here rejects
{{null}}.  This could be problematic for more complex use cases, such as someone wanting to
programmatically control the amount of readahead.  If they called {{setReadahead}} with a
custom value, then I think ideally we should allow them to call it with {{null}} later, and
restore back to the default from configuration.  (I admit this is an edge case, but a {{DFSInputStream}}
does allow this behavior.)
# {{S3AInstrumentation}} receives a {{Configuration}} in its constructor but doesn't use it.
 Can it be removed?
# {{S3AInstrumentation#gauge}} appears to be unused.
# {{InputStreamStatistics#toString}} does not include {{readFullyOperations}}.

It looks like there are some CheckStyle and JavaDoc things to follow up on from that last
pre-commit run.  The test failure is unrelated.

> add low level counter metrics for S3A; use in read performance tests
> --------------------------------------------------------------------
>                 Key: HADOOP-13028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13028
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3, metrics
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, HADOOP-13028-004.patch,
HADOOP-13028-005.patch, HADOOP-13028-006.patch, HADOOP-13028-007.patch, HADOOP-13028-008.patch,
HADOOP-13028-branch-2-008.patch, org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt,
> against S3 (and other object stores), opening connections can be expensive, closing connections
may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of open/close/failure+reconnect
operations, timers of how long things take. This can be used downstream to measure efficiency
of the code (how often connections are being made), connection reliability, etc.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

View raw message