lucene-dev mailing list archives

From "Mike Klaas"
Subject Re: Re: Attached proposed modifications to Lucene 2.0 to support Field.Store.Encrypted
Date Fri, 01 Dec 2006 20:54:27 GMT
On 12/1/06, negrinv wrote:
> I think we should not make too many assumptions about performance until we
> can test alternative solutions.


> The small payload overhead will be amply offset in my opinion by the ability
> to be very selective about what is being encrypted, as opposed to wholesale
> encryption and decryption.

Here I disagree.  There is no point in providing encryption unless the
entire scheme is cryptographically secure.  Such a determination
requires thorough knowledge of what types of information exist in
Lucene and how they are all related.  If Lucene is to provide encryption,
it should be in the form of a scheme in which the whole system is
secure.  Otherwise, what is the point?  Also, if users only want to
encrypt stored fields, that is more easily done on the client side.
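
To illustrate the client-side option, here is a minimal sketch, assuming
AES-GCM from the JDK's javax.crypto on a modern JDK.  The class and method
names (StoredFieldCrypto, newKey, encrypt, decrypt) are hypothetical and
not part of Lucene, and key management is left out entirely.  The
application encrypts the value before adding it as a stored, unindexed
field and decrypts it after reading the document back.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper: encrypts a field value in the application,
// so the index only ever sees an opaque Base64 string.
public class StoredFieldCrypto {
    private static final int IV_BYTES = 12;   // 96-bit nonce, the usual size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a fresh 128-bit AES key (assumption: the caller stores it safely).
    public static SecretKey newKey() throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    // Returns Base64(iv || ciphertext || tag), suitable for a stored-only field.
    public static String encrypt(SecretKey key, String plain) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        RANDOM.nextBytes(iv); // a new random nonce for every value
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
        byte[] out = ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
        return Base64.getEncoder().encodeToString(out);
    }

    // Reverses encrypt(); throws if the stored value was tampered with.
    public static String decrypt(SecretKey key, String stored) throws Exception {
        byte[] in = Base64.getDecoder().decode(stored);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
        byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
        return new String(plain, StandardCharsets.UTF_8);
    }
}
```

With something like this, the encrypted string would be passed to the
existing Field constructor with Field.Store.YES and Field.Index.NO, and
Lucene itself needs no changes.  This only covers stored values; it does
nothing for indexed terms or positions.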

Selectivity might actually hurt performance, as a system in
which everything is encrypted can work with whole blocks at a time and
have fancy caching schemes in place.  But at that point, it is looking
quite similar to using lucene on an encrypted filesystem.

> Also we should look at performance in the larger
> context of all the possible reasons why users might need encryption. A large
> proportion may not be worried about performance at all.

That may be, but Lucene users are generally quite sensitive to
performance factors.  What makes you think this will not be the case
for consumers of the encryption api?

> And in final
> analysis any performance degradation is not going to be crippling, we are
> probably talking about very small percentages, either way, which, as long as
> they are known and made available, will enable users to make an informed
> decision.

I'm not sure what you base the estimate of a small-percentage
performance degradation on (see your point above about making
assumptions).  I don't know for certain either, but I can easily
imagine encryption of query-related data (positions, term lists, etc)
having a huge impact on performance.  In any case, there is a
benchmark suite for lucene which can be used to measure the

