hadoop-common-issues mailing list archives

From "Nicholas Carlini (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-6837) Support for LZMA compression
Date Mon, 26 Jul 2010 19:26:22 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Nicholas Carlini updated HADOOP-6837:

    Attachment: hadoop-6349-2.patch

Attached an update patch.

Fixed the checksum mismatch. The decompressor could run out of input after reading the header
bytes without noticing when the block ID was 1. If the input held more than 16 but fewer than
26 bytes and the block ID was 1, it would go on to use whatever happened to be in the buffer
at the time.
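
The length check described above could look roughly like the following. This is only a sketch, not the code in the patch: the class name, the method names, and the position of the block ID byte are all hypothetical, and the two constants simply reuse the 16 and 26 byte figures from this message on the assumption that a block with ID 1 carries a longer header.

```java
final class HeaderCheck {
    // Figures taken from the message; what they cover in the real format is an assumption.
    static final int BASE_HEADER_LEN = 16;
    static final int EXTENDED_HEADER_LEN = 26;

    /** Hypothetical rule: a block with ID 1 needs the longer header. */
    static int requiredHeaderLen(int blockId) {
        return blockId == 1 ? EXTENDED_HEADER_LEN : BASE_HEADER_LEN;
    }

    /**
     * Returns true only if buf[off .. off+len) holds the complete header.
     * blockIdOffset is a hypothetical position of the block ID byte
     * inside the first BASE_HEADER_LEN bytes.
     */
    static boolean hasCompleteHeader(byte[] buf, int off, int len, int blockIdOffset) {
        if (blockIdOffset < 0 || blockIdOffset >= BASE_HEADER_LEN) {
            throw new IllegalArgumentException("block ID must lie inside the base header");
        }
        if (len < BASE_HEADER_LEN) {
            return false; // not even the base header is available yet
        }
        int blockId = buf[off + blockIdOffset] & 0xFF;
        // Re-check the length once the block ID is known, instead of
        // reading stale bytes left over in the buffer.
        return len >= requiredHeaderLen(blockId);
    }
}
```

The point of the sketch is that the length is checked a second time after the block ID has been read, so a short input is reported as incomplete rather than silently padded with old buffer contents.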

Fixed a bug in the decompressor where it would incorrectly report that it was finished if no
input was left at the end of decompressing a block and decompress() was then called again
(TestCodec seed 1333275328, 2011623221, -1402938700 or -1990280158; generate 50,000 records).
In fact, the decompressor now never reports finished. It should return true only when it knows
that the end of the stream has been reached, and it has no way of knowing that. It used to
guess that it was done once it had read all the bytes it currently held, which is not a safe
assumption.

Implemented getRemaining().
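
To illustrate the two changes above, here is a toy class that mimics the buffer bookkeeping only. It is a sketch under stated assumptions, not the patch: the class is hypothetical, it does no LZMA work (its decompress() just copies bytes), and the method names are borrowed from the message and from the usual decompressor vocabulary.

```java
final class BufferStateSketch {
    private byte[] in = new byte[0];
    private int inPos = 0;
    private int inLen = 0; // bytes handed in but not yet consumed

    void setInput(byte[] b, int off, int len) {
        in = b;
        inPos = off;
        inLen = len;
    }

    /** An empty buffer only means "give me more", not "end of stream". */
    boolean needsInput() {
        return inLen == 0;
    }

    /** Never guesses: without an end-of-stream marker it cannot know. */
    boolean finished() {
        return false;
    }

    /** Number of input bytes still unconsumed in the current buffer. */
    int getRemaining() {
        return inLen;
    }

    /** Stand-in for real decompression: copies up to len bytes through. */
    int decompress(byte[] out, int off, int len) {
        int n = Math.min(len, inLen);
        System.arraycopy(in, inPos, out, off, n);
        inPos += n;
        inLen -= n;
        return n;
    }
}
```

With this split, a caller that drains the buffer sees needsInput() become true while finished() stays false, so calling decompress() again simply yields zero bytes instead of signalling a premature end.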

Removed iOff from both the compressor and decompressor. It was initialized to zero and was
only ever reset to 0 after that, so it never held any other value.

Modified TestCodec to accept a seed as an argument.
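
A seed argument of this kind makes a failing run reproducible. The following is a minimal sketch of the idea and not the actual TestCodec change: the class and method names are hypothetical, and random bytes stand in for the generated records.

```java
import java.util.Random;

final class SeededRun {
    /** Uses the first argument as the seed if present, otherwise picks one at random. */
    static long parseSeed(String[] args) {
        return args.length > 0 ? Long.parseLong(args[0]) : new Random().nextLong();
    }

    /** Generates test data that depends only on the seed and the count. */
    static byte[] generate(long seed, int count) {
        byte[] data = new byte[count];
        new Random(seed).nextBytes(data);
        return data;
    }

    public static void main(String[] args) {
        long seed = parseSeed(args);
        // Always print the seed so a failing run can be repeated later.
        System.out.println("seed = " + seed);
        byte[] data = generate(seed, 50_000);
        System.out.println("generated " + data.length + " bytes");
    }
}
```

Running it with one of the seeds listed earlier, for example `java SeededRun -1402938700`, would regenerate the same data on every run.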

Removed the rest of the carriage returns.

I will be adding a native version over the next few days and will upload that patch when it's

> Support for LZMA compression
> ----------------------------
>                 Key: HADOOP-6837
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6837
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>            Reporter: Nicholas Carlini
>            Assignee: Nicholas Carlini
>         Attachments: hadoop-6349-2.patch, HADOOP-6837-lzma-1-20100722.patch, HADOOP-6837-lzma-c-20100719.patch,
> Add support for LZMA (http://www.7-zip.org/sdk.html) compression, which generally achieves
> higher compression ratios than both gzip and bzip2.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
