hadoop-common-issues mailing list archives

From "Xiaoyu Yao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11708) CryptoOutputStream synchronization differences from DFSOutputStream break HBase
Date Thu, 12 Mar 2015 17:42:38 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359044#comment-14359044 ]

Xiaoyu Yao commented on HADOOP-11708:

Agree with [~steve_l@iseran.com] on the risk assessment of changing DFSOutputStream.
+1 for fixing CryptoOutputStream to meet the same thread-safety expectations as the HDFS stream.

As a workaround until CryptoOutputStream is fixed, users can use the HBase native encryption
at rest introduced by [HBASE-7544|https://issues.apache.org/jira/browse/HBASE-7544] to encrypt
HBase HFile/WAL files and persist them with the normal HDFS DFSOutputStream.

> CryptoOutputStream synchronization differences from DFSOutputStream break HBase
> -------------------------------------------------------------------------------
>                 Key: HADOOP-11708
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11708
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.6.0
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>            Priority: Critical
> For the write-ahead-log, HBase writes to DFS from a single thread and sends sync/flush/hflush
> from a configurable number of other threads (default 5).
> FSDataOutputStream does not document anything about being thread safe, and it is not
> thread safe for concurrent writes.
> However, DFSOutputStream is thread safe for concurrent writes + syncs. When it is the
> stream FSDataOutputStream wraps, the combination is threadsafe for 1 writer and multiple syncs
> (the exact behavior HBase relies on).
> When HDFS Transparent Encryption is turned on, CryptoOutputStream is inserted between
> FSDataOutputStream and DFSOutputStream. It is proactively labeled as not thread safe, and
> this composition is not thread safe for any operations.
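
The access pattern described above (one writer thread, several sync threads) can be sketched in plain Java as follows. This is only an illustration, not the HADOOP-11708 patch, and it uses no Hadoop classes: UnsafeBufferingStream is a hypothetical stand-in for a buffering stream that is not thread safe, LockingStream is a hypothetical wrapper that serializes write and flush on one lock, and flush() stands in for hflush/hsync.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

final class SingleWriterMultiSyncSketch {

    /** Hypothetical stand-in for a buffering stream that is not thread safe. */
    static final class UnsafeBufferingStream extends OutputStream {
        private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        private final byte[] buf = new byte[64];
        private int count;

        @Override public void write(int b) {
            if (count == buf.length) {
                flush();                  // drain the buffer before it overflows
            }
            buf[count++] = (byte) b;
        }

        @Override public void flush() {
            sink.write(buf, 0, count);    // push buffered bytes downstream
            count = 0;
        }

        int flushedBytes() {
            return sink.size();
        }
    }

    /** Hypothetical wrapper: every write and flush takes the same lock. */
    static final class LockingStream extends FilterOutputStream {
        LockingStream(OutputStream inner) {
            super(inner);
        }

        @Override public synchronized void write(int b) throws IOException {
            out.write(b);
        }

        @Override public synchronized void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len);
        }

        @Override public synchronized void flush() throws IOException {
            out.flush();
        }
    }

    /** One writer thread (the caller) plus syncThreads threads that only flush. */
    static int run(int records, int syncThreads) throws Exception {
        UnsafeBufferingStream unsafe = new UnsafeBufferingStream();
        OutputStream stream = new LockingStream(unsafe);
        AtomicBoolean running = new AtomicBoolean(true);

        List<Thread> syncers = new ArrayList<>();
        for (int i = 0; i < syncThreads; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (running.get()) {
                        stream.flush();   // concurrent "sync" calls
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            t.start();
            syncers.add(t);
        }

        for (int i = 0; i < records; i++) {
            stream.write(i);              // single writer, one byte per record
        }

        running.set(false);
        for (Thread t : syncers) {
            t.join();
        }
        stream.flush();                   // final flush after all syncers stop
        return unsafe.flushedBytes();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(10_000, 5));
    }
}
```

With the wrapper in place, the number of flushed bytes should equal the number of bytes written. Passing the unsafe stream to the threads directly would let flush() reset the buffer index while write() is using it, which is the kind of race the description attributes to the unsynchronized composition.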

This message was sent by Atlassian JIRA
