From "Rajesh Balamohan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13560) S3ABlockOutputStream to support huge (many GB) file writes
Date Fri, 30 Sep 2016 05:51:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535127#comment-15535127 ]

Rajesh Balamohan commented on HADOOP-13560:
-------------------------------------------

S3ABlockOutputStream::initiateMultiPartUpload() has the following log statement:
{noformat}
LOG.debug("Initiating Multipart upload for block {}", currentBlock);
{noformat}

In S3ADataBlocks.java, the patch has the following toString() for ByteArrayBlock:
{noformat}
@Override
public String toString() {
  return "ByteArrayBlock{" +
      "state=" + getState() +
      ", buffer=" + buffer +
      ", limit=" + limit +
      ", dataSize=" + dataSize +
      '}';
}
{noformat}

When DEBUG logging was enabled to inspect the AWS traffic, the {} placeholder in the log statement expanded via currentBlock.toString(), which printed the entire contents of the buffer. When debugging a large data transfer (4 GB in my case), this produced huge chunks of log output that are not needed. Would it be possible to print only the buffer sizes?
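
For example, the override could drop the raw buffer field and report only the size information. A minimal sketch, reusing the getState(), limit, and dataSize members already shown above:
{noformat}
@Override
public String toString() {
  // Report sizes only; dumping the buffer contents floods DEBUG logs.
  return "ByteArrayBlock{" +
      "state=" + getState() +
      ", limit=" + limit +
      ", dataSize=" + dataSize +
      '}';
}
{noformat}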

> S3ABlockOutputStream to support huge (many GB) file writes
> ----------------------------------------------------------
>
>                 Key: HADOOP-13560
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13560
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>         Attachments: HADOOP-13560-branch-2-001.patch, HADOOP-13560-branch-2-002.patch, HADOOP-13560-branch-2-003.patch, HADOOP-13560-branch-2-004.patch
>
>
> An AWS SDK [issue|https://github.com/aws/aws-sdk-java/issues/367] highlights that metadata isn't copied on large copies.
> 1. Add a test to do that large copy/rename and verify that the copy really works.
> 2. Verify that metadata makes it over.
> Verifying large file rename is important on its own, as it is needed for very large commit operations for committers using rename.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
