lucene-java-user mailing list archives

From Glen Newton <>
Subject Re: If you could have one feature in Lucene...
Date Sat, 27 Feb 2010 16:19:18 GMT
Hello Uwe.

That will teach me for not keeping up with the versions! :-)
So it is up to the application to keep track of what it used for compression.


On 27 February 2010 10:17, Uwe Schindler <> wrote:
> Hi Glen,
>> Pluggable compression allowing for alternatives to gzip for text
>> compression for storing.
>> Specifically I am interested in bzip2[1] as implemented in Apache
>> Commons Compress[2].
>> While bzip2 compression is considerably slower than gzip (although
>> decompression is not too much slower than gzip) it compresses much
>> better than gzip (especially text).
>> Having the choice would be helpful, and for Lucene usage for non-text
>> indexing, content specific compression algorithms may outperform the
>> default gzip.
> Compression support was removed entirely in Lucene 3.0 (in 2.9 it is still available,
> but deprecated). All you have to do now is store your compressed stored fields as a
> byte[] (see the Field javadocs), which lets you use any compression you like. The problems
> with gzip and the other available compression algorithms led us to remove compression
> support from Lucene. In general, the way to go is: create a ByteArrayOutputStream, wrap it
> with any compression filter, feed your data in, and use "new Field(name, stream.toByteArray())".
> On the client side, just do the inverse (Document.getBinaryValue(), create an input stream
> on top of the byte[], and decompress).
> Uwe
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
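
For reference, the approach Uwe describes could be sketched roughly as follows. This is only a minimal illustration, not code from the thread: the class and method names are made up, it uses the JDK's gzip streams so that it runs without extra jars, and the Lucene calls appear only in comments because the exact Field constructor depends on the Lucene version. Swapping in BZip2CompressorOutputStream / BZip2CompressorInputStream from Commons Compress should only require changing the two stream classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: compress text to a byte[] before storing it, and
// decompress the byte[] read back from a stored binary field.
public final class StoredFieldCompression {

    // Wrap a ByteArrayOutputStream with a compression filter and feed the data in.
    public static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } // closing the filter writes the gzip trailer into the buffer
        return buffer.toByteArray();
    }

    // The inverse: an input stream on top of the byte[], then decompress.
    public static String decompress(byte[] stored) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(stored))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String text = "some long text that is stored but compressed by the application";

        byte[] stored = compress(text);
        // Indexing side (assumed Lucene 2.9/3.0-style API, check your version):
        //   doc.add(new Field("body", stored, Field.Store.YES));

        // Search side (assumed):
        //   byte[] stored = doc.getBinaryValue("body");
        String restored = decompress(stored);

        System.out.println("round trip ok: " + text.equals(restored));
    }
}
```

Since Lucene only sees an opaque byte[], the application has to remember which codec produced it, for example by using one codec per field name or by keeping the codec name in a second stored field.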


