hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
Date Fri, 06 May 2016 19:53:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274641#comment-15274641 ]

Colin Patrick McCabe commented on HADOOP-13028:

<property>
  <name>fs.s3a.readahead.range</name>
  <value>65536</value>
  <description>Bytes to read ahead during a seek() before closing and
  re-opening the S3 HTTP connection.</description>
</property>
Hmm, should this be {{fs.s3a.readahead.default}}?  It seems like this is the default if the
user doesn't call {{FSDataInputStream#setReadahead}}.
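To make the naming question concrete, here is a minimal sketch of the fallback behaviour being described. It assumes a hypothetical {{ReadaheadSetting}} class (not part of S3A or the patch) in which the configured value is used until the caller overrides it, and a null argument restores the configured default:

```java
/**
 * Hypothetical illustration only; not the S3A implementation.
 * Models a configured readahead value that acts as the default
 * until a caller explicitly overrides it.
 */
class ReadaheadSetting {
  private final long configuredDefault;
  private long readahead;

  ReadaheadSetting(long configuredDefault) {
    if (configuredDefault < 0) {
      throw new IllegalArgumentException("Negative readahead: " + configuredDefault);
    }
    this.configuredDefault = configuredDefault;
    // Until the caller says otherwise, the configured value applies.
    this.readahead = configuredDefault;
  }

  /** A null value means "go back to the configured default". */
  void setReadahead(Long value) {
    if (value == null) {
      readahead = configuredDefault;
    } else if (value < 0) {
      throw new IllegalArgumentException("Negative readahead: " + value);
    } else {
      readahead = value;
    }
  }

  long getReadahead() {
    return readahead;
  }
}
```

Under this reading the property behaves as a default rather than a fixed range, which is the argument for the {{.default}} suffix.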

{{S3AInputStream#closed}}: it seems like this should be an {{AtomicBoolean}}.  Otherwise two
threads could both enter this code block, right?
    if (!closed) {
      closed = true;
      super.close();
      closeStream("close() operation", this.contentLength);
      streamStatistics.close();
    }
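A minimal sketch of the {{AtomicBoolean}} approach, using a hypothetical {{CloseOnceStream}} class rather than the real {{S3AInputStream}}; the {{releaseResources}} method and its counter are stand-ins for the actual cleanup calls. {{compareAndSet}} folds the check and the set into one atomic step, so only one caller can run the cleanup:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a stream; not the real S3AInputStream. */
class CloseOnceStream implements AutoCloseable {
  private final AtomicBoolean closed = new AtomicBoolean(false);
  // Counts how often cleanup ran, purely for demonstration.
  private final AtomicInteger releaseCount = new AtomicInteger();

  @Override
  public void close() {
    // compareAndSet(false, true) succeeds for exactly one caller,
    // even if several threads call close() at the same time.
    if (closed.compareAndSet(false, true)) {
      releaseResources();
    }
  }

  /** Placeholder for the real cleanup (closing the HTTP stream, statistics). */
  private void releaseResources() {
    releaseCount.incrementAndGet();
  }

  boolean isClosed() {
    return closed.get();
  }

  int getReleaseCount() {
    return releaseCount.get();
  }
}
```

Calling {{close()}} a second time, from the same thread or another one, is then a no-op. Making {{close()}} {{synchronized}} would be an alternative if the cleanup must also finish before other callers return.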

  public S3AInstrumentation.InputStreamStatistics getStreamStatistics() {
Maybe this should be called {{getS3StreamStatistics}}, reflecting the fact that this API is S3-specific?

Is it really necessary to put statistics information into the {{toString}} methods of the
streams?  It seems like this could lead to compatibility woes, and we have the API described
above to provide this information anyway.

> add low level counter metrics for S3A; use in read performance tests
> --------------------------------------------------------------------
>                 Key: HADOOP-13028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13028
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3, metrics
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, HADOOP-13028-004.patch,
HADOOP-13028-005.patch, HADOOP-13028-006.patch, HADOOP-13028-007.patch, HADOOP-13028-008.patch,
HADOOP-13028-009.patch, HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch,
HADOOP-13028-branch-2-010.patch, org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt,
> against S3 (and other object stores), opening connections can be expensive, closing connections
may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of open/close/failure+reconnect
operations, timers of how long things take. This can be used downstream to measure efficiency
of the code (how often connections are being made), connection reliability, etc.
