hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK
Date Tue, 21 Nov 2017 21:33:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261515#comment-16261515 ]

Steve Loughran commented on HADOOP-13887:

FWIW, Presto already has this, so they are the ones who get to see the prestofs issues:

* https://github.com/prestodb/presto/issues/7186 : Presto doesn't seem to be able to read
encrypted Parquet data
* https://github.com/aws/aws-sdk-java/issues/1057 : EMRFS doesn't set the x-amz-unencrypted-content-length

Presto does look for the header; it just gets burned by EMRFS-saved data, which doesn't set
it. What does EMR do? From the issues:

bq. We had a chat with the EMR people to understand how Hive/Spark is able to read encrypted
files when the x-amz-unencrypted-content-length is not set. The outcome is, EMR Hive/Spark
reads the entire file in those cases to determine the unencrypted content length, which is
something that we don't really want to do.
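
To make the trade-off in that quote concrete, here is a minimal sketch of the lookup-with-fallback logic. It is not taken from Presto, EMRFS, or any S3A patch: the class PlaintextLength, its methods, and the use of a plain Map for object metadata are all hypothetical, and the only name taken from the issues above is the x-amz-unencrypted-content-length header. The fullScan supplier stands in for "read and decrypt the entire object and count the bytes", which is the expensive path nobody wants.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.function.LongSupplier;

/** Hypothetical helper; not part of Hadoop, Presto, or the AWS SDK. */
final class PlaintextLength {

    /** Header name as given in the issues quoted above. */
    static final String UNENCRYPTED_LENGTH = "x-amz-unencrypted-content-length";

    private PlaintextLength() {
    }

    /**
     * Returns the plaintext length recorded in the metadata, or empty if the
     * entry is missing, not a number, or negative.
     * Assumption: the caller has already copied the object's metadata into
     * a Map keyed by the exact header name.
     */
    static OptionalLong fromMetadata(Map<String, String> metadata) {
        String raw = metadata.get(UNENCRYPTED_LENGTH);
        if (raw == null) {
            return OptionalLong.empty();
        }
        try {
            long length = Long.parseLong(raw.trim());
            return length >= 0 ? OptionalLong.of(length) : OptionalLong.empty();
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    /**
     * Prefers the recorded length; only when it is unavailable does it invoke
     * fullScan, the caller-supplied (and expensive) whole-object read.
     */
    static long resolve(Map<String, String> metadata, LongSupplier fullScan) {
        OptionalLong recorded = fromMetadata(metadata);
        return recorded.isPresent() ? recorded.getAsLong() : fullScan.getAsLong();
    }
}
```

With the header present, resolve() never touches fullScan; with EMRFS-style data that lacks the header, every length query pays for a full read, which is the behaviour described in the quote.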

> Encrypt S3A data client-side with AWS SDK
> -----------------------------------------
>                 Key: HADOOP-13887
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13887
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Jeeyoung Kim
>            Assignee: Igor Mazur
>            Priority: Minor
>         Attachments: HADOOP-13887-002.patch, HADOOP-13887-007.patch, HADOOP-13887-branch-2-003.patch,
HADOOP-13897-branch-2-004.patch, HADOOP-13897-branch-2-005.patch, HADOOP-13897-branch-2-006.patch,
HADOOP-13897-branch-2-008.patch, HADOOP-13897-branch-2-009.patch, HADOOP-13897-branch-2-010.patch,
HADOOP-13897-branch-2-012.patch, HADOOP-13897-branch-2-014.patch, HADOOP-13897-trunk-011.patch,
HADOOP-13897-trunk-013.patch, HADOOP-14171-001.patch, S3-CSE Proposal.pdf
> Expose the client-side encryption option documented in Amazon S3 documentation  - http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingClientSideEncryption.html
> Currently this is not exposed in Hadoop but it is exposed as an option in AWS Java SDK,
which Hadoop currently includes. It should be trivial to propagate this as a parameter passed
to the S3client used in S3AFileSystem.java
