hadoop-common-issues mailing list archives

From "Tatu Saloranta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-6389) Add support for LZF compression
Date Mon, 01 Aug 2011 03:15:37 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073428#comment-13073428 ]

Tatu Saloranta commented on HADOOP-6389:

Lzf4hadoop project at github -- https://github.com/ning/lzf4hadoop -- now provides necessary
I hope to get more testing done to ensure that interaction with Hadoop abstractions works as intended;
assuming things go well, this could serve as the implementation to use. Or, if a separate project
and Maven-accessible artifacts are enough, maybe just add a link from the documentation.

As to performance, see https://github.com/ning/jvm-compressor-benchmark .
LZF is the fastest pure-Java compressor tested; of all included codecs, Snappy (which uses
JNI to call the C implementation of the Snappy codec) is faster for decompression and about as fast for compression.
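
For readers who want a feel for how such speed and size comparisons are taken, here is a minimal sketch that times a compress/decompress round trip. It is not code from the benchmark project above: it uses only the JDK's deflate implementation (java.util.zip), because LZF is not part of the JDK, and the class name CodecTiming and the synthetic JSON-like input are assumptions made up for illustration. A single timed pass with no JIT warm-up gives only a rough indication, so treat the numbers as a starting point rather than a result.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, for illustration only.
public class CodecTiming {

    // Compress the input with JDK deflate at the given level (1 = fastest, 9 = smallest).
    static byte[] deflate(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress a complete deflate stream produced by deflate() above.
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or dictionary-dependent input");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt input", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Build synthetic, repetitive JSON-like records (made-up data, not a real data set).
    static byte[] sampleJson(int records) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < records; i++) {
            sb.append("{\"id\":").append(i)
              .append(",\"name\":\"user").append(i % 100)
              .append("\",\"active\":true}\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] input = sampleJson(200_000);
        for (int level : new int[] {Deflater.BEST_SPEED, Deflater.BEST_COMPRESSION}) {
            long start = System.nanoTime();
            byte[] packed = deflate(input, level);
            long compressNanos = System.nanoTime() - start;

            start = System.nanoTime();
            byte[] restored = inflate(packed);
            long decompressNanos = System.nanoTime() - start;

            // Verify the round trip before trusting any timing.
            if (!Arrays.equals(input, restored)) {
                throw new IllegalStateException("round trip failed");
            }
            System.out.printf("level %d: %d -> %d bytes, compress %.1f ms, decompress %.1f ms%n",
                    level, input.length, packed.length,
                    compressNanos / 1e6, decompressNanos / 1e6);
        }
    }
}
```

Comparing another codec would mean swapping the deflate/inflate pair for that codec's encode and decode calls while keeping the same input and the same round-trip check.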

Compression rates between basic Lempel-Ziv implementations (QuickLZ, LZO, Snappy, LZF) are
comparable; and all are significantly faster than basic deflate (but with lower compression

> Add support for LZF compression
> -------------------------------
>                 Key: HADOOP-6389
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6389
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: io
>            Reporter: Tatu Saloranta
> (note: related to [HADOOP-4874])
> As per Doug's earlier comments, LZF does indeed look like a good compressor candidate
> for fast compression/decompression, with a good enough compression rate.
> From my testing it seems at least twice as fast at compression, and somewhat faster at
> decompression, than gzip.
> Code from [http://h2database.googlecode.com/svn/trunk/h2/src/main/org/h2/compress/] is
> applicable, and I have tested it with JSON data.
> I hope to have more time to spend on this in the near future, but if someone else gets to this
> first that'd be good too.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

