kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Data encryption in Kudu
Date Tue, 25 Apr 2017 19:49:50 GMT
Agreed with what Dan said.

I think there are a number of interesting design alternatives to be
considered, so before coding it would be great to work through a design
document to explore the alternatives. For example, we could try to apply
encryption at the 'fs/' layer, which would cover all non-WAL data, but then
we would lose the ability to specify encryption on a per-column basis.
There are other requirements that need to be ironed out about whether we'd
need to support separate encryption keys per column/table/server/etc,
whether metadata also needs to be encrypted, etc.
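
To make the key-scope question concrete, here is a minimal sketch of the double-key (envelope) scheme that Franco describes for HDFS in the quoted thread below, with one data key per scope (for example per table and column). Everything in it is an assumption for illustration only: `ToyKeyService`, `write_block`, `read_block`, and the scope tuples are hypothetical names, not Kudu or Hadoop KMS APIs, and the SHA-256 keystream is a dependency-free stand-in for a real cipher such as AES. It is not secure and not a proposed implementation.

```python
import hashlib
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over key, nonce, counter). NOT a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return a random 16-byte nonce followed by plaintext XOR keystream."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Inverse of toy_encrypt: split off the nonce and XOR again."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class ToyKeyService:
    """Hypothetical stand-in for an external key service (KMS or HSM)."""

    def __init__(self) -> None:
        self._master_key = os.urandom(32)  # never leaves the service

    def new_data_key(self) -> tuple[bytes, bytes]:
        """Create a data key; return it in plain and wrapped (encrypted) form."""
        data_key = os.urandom(32)
        return data_key, toy_encrypt(self._master_key, data_key)

    def unwrap(self, wrapped_key: bytes) -> bytes:
        """Decrypt a wrapped data key with the master key."""
        return toy_decrypt(self._master_key, wrapped_key)


def write_block(kms: ToyKeyService, scope: tuple, data: bytes, key_store: dict) -> bytes:
    """Encrypt data under the data key for this scope, creating the key if needed."""
    if scope not in key_store:
        data_key, wrapped = kms.new_data_key()
        key_store[scope] = wrapped  # only the wrapped key is persisted
    else:
        data_key = kms.unwrap(key_store[scope])
    return toy_encrypt(data_key, data)


def read_block(kms: ToyKeyService, scope: tuple, blob: bytes, key_store: dict) -> bytes:
    """Unwrap the data key for this scope and decrypt the block."""
    return toy_decrypt(kms.unwrap(key_store[scope]), blob)
```

Changing the scope tuple from (table, column) to (server,) or (tablet,) changes how many data keys exist and who must be able to unwrap them, which is the trade-off the design document would need to settle.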


On Tue, Apr 25, 2017 at 10:38 AM, Dan Burkert <danburkert@apache.org> wrote:

> Hi Franco,
> I think you are right that a client-based approach wouldn't work, because
> we wouldn't want to encrypt at the level of individual cell values.  That
> would get in the way of encoding, compression, predicate evaluation, etc.
> As you note, adding encryption at the block layer is probably the way to
> go.  Key management is definitely the tricky issue.  We do have one
> advantage over HDFS - because Kudu does logical replication, the encryption
> key can be scoped to a particular tablet server or tablet replica; it
> wouldn't need to be shared among all replicas.  I haven't done enough
> research to know if this makes it fundamentally easier to do key
> management.  I would assume at a minimum we would want to integrate with
> key providers such as an HSM.  It would be good to have a thorough review of
> existing solutions in the space, such as TDE
> <https://en.wikipedia.org/wiki/Transparent_Data_Encryption> and the
> Hadoop KMS.  Is this something you are interested in working on?
> - Dan
> On Tue, Apr 25, 2017 at 8:30 AM, David Alves <davidralves@gmail.com>
> wrote:
>> Hi Franco
>>   Dan, Alexey, Todd are our security experts.
>>   Folks, thoughts on this?
>> Best
>> David
>> On Mon, Apr 24, 2017 at 7:08 PM, <fventuri@comcast.net> wrote:
>>> Over the weekend I started looking at what it would take to add data
>>> encryption to Kudu (besides using filesystem encryption via dm-crypt or
>>> something like that).
>>> Here are a few notes - please feel free to comment on them and add
>>> suggestions:
>>> - reading through this mailing list, it looks like this feature has been
>>> asked for a couple of times last year, but from what I can tell, no one is
>>> currently working on it.
>>> - a client-based approach to encryption like the one used by HDFS
>>> wouldn't work (at least out of the box) because for instance encrypting the
>>> primary key at the client would prevent being able to have range filters
>>> for scans; it might work for the columns that are not part of the primary
>>> key
>>> - there's already code in Kudu for several compression codecs (LZ4,
>>> gzip, etc); I thought it would be possible to add similar code for
>>> encryption codecs (to be applied after the compression, of course)
>>> - the WAL log files and delta files should be similarly encrypted too
>>> - not sure what would be the best way to manage the key - I see that in
>>> HDFS they use a double key mechanism, where the encryption key for the data
>>> file is itself encrypted with the allowed user key and this whole process
>>> is managed by an external Key Management Service
>>> Thanks in advance for your ideas and suggestions,
>>> Franco
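
Two points from the thread above can be checked with a small experiment: encrypting key values individually destroys their sort order (so range filters cannot be evaluated on ciphertext), and encryption has to come after compression because ciphertext does not compress. The sketch below is illustrative only. `toy_encrypt` is a hypothetical, insecure stand-in for a real cipher, chosen so the example needs only the Python standard library, and the sample keys and column data are made up.

```python
import hashlib
import os
import zlib


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy stand-in for a real cipher: random nonce + SHA-256 keystream XOR."""
    nonce = os.urandom(16)
    stream = bytearray()
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def order_preserved(key: bytes, n: int = 50) -> bool:
    """Encrypt n sorted primary-key values; report whether they stay sorted."""
    primary_keys = [f"{i:08d}".encode() for i in range(n)]  # already in order
    encrypted = [toy_encrypt(key, pk) for pk in primary_keys]
    return encrypted == sorted(encrypted)


def compression_sizes(key: bytes) -> tuple[int, int, int]:
    """Return (raw size, compress-then-encrypt size, encrypt-then-compress size)."""
    column = b"status=OK;" * 1000  # made-up, highly repetitive column data
    compress_then_encrypt = len(toy_encrypt(key, zlib.compress(column)))
    encrypt_then_compress = len(zlib.compress(toy_encrypt(key, column)))
    return len(column), compress_then_encrypt, encrypt_then_compress


if __name__ == "__main__":
    key = os.urandom(32)
    print("sort order preserved after encryption:", order_preserved(key))
    raw, cte, etc = compression_sizes(key)
    print("raw bytes:", raw)
    print("compress then encrypt:", cte)
    print("encrypt then compress:", etc)
```

The same reasoning applies to encodings such as dictionary or run-length encoding: they rely on patterns in the plaintext, so they have to run before any encryption step.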

Todd Lipcon
Software Engineer, Cloudera
