hadoop-common-issues mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
Date Mon, 12 May 2014 17:42:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995305#comment-13995305 ]

Alejandro Abdelnur commented on HADOOP-10150:

[cross-posting with HDFS-6134]

Reopening HDFS-6134

After some offline discussions with Yi, Tianyou, ATM, Todd, Andrew and Charles, we think it
makes more sense to implement encryption for HDFS directly in the DistributedFileSystem
client and to use CryptoFileSystem to support encryption for FileSystems that don’t support
native encryption.

The reasons for this change of course are:

* If we want to add support for HDFS transparent compression, the compression should be
done before the encryption (plaintext has less entropy than ciphertext, so it compresses
better). If compression is to be handled by HDFS DistributedFileSystem, then the encryption
has to be handled afterwards (in the write path).

* The proposed CryptoSupport abstraction significantly complicates the implementation of CryptoFileSystem
and the wiring in HDFS FileSystem client.

* Building it directly into HDFS FileSystem client may allow us to avoid an extra copy of
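
As an illustration of the first point, here is a minimal sketch (not part of any patch on
this issue) that deflates the same buffer before and after encryption. The class name, the
all-zero demo key and IV, and the choice of AES/CTR are assumptions made only for the example;
the comment above does not specify a cipher mode.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import java.util.zip.Deflater;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class: compares how well plaintext and ciphertext compress.
class CompressThenEncrypt {

    // Deflate the input and return the compressed size in bytes.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buf);
        }
        deflater.end();
        return total;
    }

    // Encrypt with AES/CTR using a fixed all-zero key and IV (demo only, never do this for real data).
    static byte[] encrypt(byte[] plain) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE,
                    new SecretKeySpec(new byte[16], "AES"),
                    new IvParameterSpec(new byte[16]));
            return cipher.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] plain = new byte[64 * 1024];
        Arrays.fill(plain, (byte) 'a'); // highly repetitive, low-entropy input

        System.out.println("compress(plaintext)  = " + deflatedSize(plain) + " bytes");
        System.out.println("compress(ciphertext) = " + deflatedSize(encrypt(plain)) + " bytes");
    }
}
```

The ciphertext looks random to the compressor, so deflating it saves nothing, which is why
compression has to sit before encryption in the write path.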

Because of this, the idea is now:

* A common set of Crypto Input/Output streams. They would be used by CryptoFileSystem, HDFS
encryption, MapReduce intermediate data and spills. Note we cannot use the JDK Cipher Input/Output
streams directly because we need to support the additional interfaces that the Hadoop FileSystem
streams implement (Seekable, PositionedReadable, ByteBufferReadable, HasFileDescriptor, CanSetDropBehind,
CanSetReadahead, HasEnhancedByteBufferAccess, Syncable, CanSetDropBehind).

* CryptoFileSystem. To support encryption in arbitrary FileSystems.

* HDFS client encryption. To support transparent HDFS encryption.

Both CryptoFileSystem and HDFS client encryption implementations would be built using the
Crypto Input/Output streams, xAttributes and KeyProvider API.
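
To make the Seekable/PositionedReadable requirement concrete, here is a rough sketch of how a
crypto stream could decrypt from an arbitrary file position. It assumes AES in CTR mode, which
this comment does not specify; the class and method names are hypothetical and are not the
proposed Hadoop API. The JDK CipherInputStream only reads sequentially, whereas with CTR the
counter for any offset can be computed directly.

```java
import java.math.BigInteger;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: random-access encryption/decryption with AES/CTR.
class SeekableCtrSketch {
    static final int BLOCK = 16; // AES block size in bytes

    // Counter block for the AES block that contains byte `offset`:
    // (IV + offset / 16) mod 2^128, big-endian.
    static byte[] counterAt(byte[] iv, long offset) {
        BigInteger ctr = new BigInteger(1, iv).add(BigInteger.valueOf(offset / BLOCK));
        byte[] raw = ctr.toByteArray();
        byte[] out = new byte[BLOCK];
        int n = Math.min(raw.length, BLOCK);
        // Keep only the low 16 bytes, right-aligned.
        System.arraycopy(raw, raw.length - n, out, BLOCK - n, n);
        return out;
    }

    // Apply the CTR keystream to `data`, whose first byte sits at stream position `offset`.
    // In CTR mode the same operation both encrypts and decrypts.
    static byte[] cryptAt(byte[] key, byte[] iv, long offset, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE,
                    new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(counterAt(iv, offset)));
            // Discard the keystream bytes that precede `offset` inside its block.
            cipher.update(new byte[(int) (offset % BLOCK)]);
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16]; // all-zero demo key and IV, for illustration only
        byte[] iv = new byte[16];
        byte[] plain = new byte[100];
        for (int i = 0; i < plain.length; i++) {
            plain[i] = (byte) i;
        }

        byte[] cipherText = cryptAt(key, iv, 0, plain);

        // "Seek" to position 37 and decrypt only the tail of the ciphertext.
        int pos = 37;
        byte[] tail = cryptAt(key, iv, pos, Arrays.copyOfRange(cipherText, pos, cipherText.length));
        System.out.println(Arrays.equals(tail, Arrays.copyOfRange(plain, pos, plain.length)));
    }
}
```

The main method reports whether decrypting from the middle of the stream yields the same bytes
as the corresponding slice of the plaintext. A real stream would wrap this logic behind the
Hadoop stream interfaces listed above.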

> Hadoop cryptographic file system
> --------------------------------
>                 Key: HADOOP-10150
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10150
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 3.0.0
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>              Labels: rhino
>             Fix For: 3.0.0
>         Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file system-V2.docx,
HADOOP cryptographic file system.pdf, HDFSDataAtRestEncryptionAlternatives.pdf, HDFSDataatRestEncryptionAttackVectors.pdf,
HDFSDataatRestEncryptionProposal.pdf, cfs.patch, extended information based on INode feature.patch
> There is an increasing need for securing data when Hadoop customers use various upper
layer applications, such as Map-Reduce, Hive, Pig, HBase and so on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based on HADOOP
“FilterFileSystem” decorating DFS or other file systems, and transparent to upper layer
applications. It’s configurable, scalable and fast.
> High level requirements:
> 1.	Transparent to and no modification required for upper layer applications.
> 2.	“Seek”, “PositionedReadable” are supported for input stream of CFS if the
wrapped file system supports them.
> 3.	Very high performance for encryption and decryption, so that they do not become a bottleneck.
> 4.	Can decorate HDFS and all other file systems in Hadoop, and will not modify existing
structure of file system, such as namenode and datanode structure if the wrapped file system
is HDFS.
> 5.	Admin can configure encryption policies, such as which directory will be encrypted.
> 6.	A robust key management framework.
> 7.	Support Pread and append operations if the wrapped file system supports them.
