hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-9802) Support Snappy codec on Windows.
Date Wed, 31 Jul 2013 21:53:48 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated HADOOP-9802:

    Attachment: HADOOP-9802-branch-1-win.1.patch

This work started on branch-1-win, so I'm attaching the patch for that branch first.  I'll provide
a trunk patch soon as well.  Here is a summary of the changes:
# Update the runtime library path used in hadoop.cmd so that snappy.dll can be loaded from
lib/native if the build bundled Snappy into the distro.
# Change build.xml to call javah on Windows.
# Change the Visual Studio project file to compile the C code.
# Add Windows-specific dynamic library loading code.
# Make minor changes to the C code to guarantee the correct calling convention and to move a
few variable declarations to the top of the function, because MSVC doesn't support C99.
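
For anyone checking item 1 on their own machine, here is a minimal sketch (not taken from the patch) that looks for the platform-specific Snappy library file in the directories of a path list. The class and method names are hypothetical. The summary above does not say which path variable hadoop.cmd adjusts, so the example reads java.library.path only as an illustration; pass in whichever path list your launcher actually sets.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of Hadoop or of the attached patch.
public class NativeLibraryLocator {

    /**
     * Returns the first file named fileName found in the directories of
     * pathList, where pathList uses the platform path separator
     * (";" on Windows, ":" on most other systems).
     */
    public static Optional<Path> findLibrary(String pathList, String fileName) {
        if (pathList == null || pathList.isEmpty()) {
            return Optional.empty();
        }
        for (String entry : pathList.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue; // skip empty entries such as "a;;b"
            }
            try {
                Path candidate = Paths.get(entry, fileName);
                if (Files.isRegularFile(candidate)) {
                    return Optional.of(candidate);
                }
            } catch (InvalidPathException e) {
                // Ignore malformed entries and keep searching.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // mapLibraryName yields "snappy.dll" on Windows, "libsnappy.so" on Linux.
        String fileName = System.mapLibraryName("snappy");
        // Assumption: java.library.path is used here only as an example path list.
        String pathList = System.getProperty("java.library.path");
        String result = findLibrary(pathList, fileName)
                .map(Path::toString)
                .orElse("not found");
        System.out.println(fileName + ": " + result);
    }
}
```

If the bundled distro is set up as described, running this with the path list pointing at lib/native should report the location of snappy.dll; "not found" suggests the library was not bundled or the path was not updated.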

Assuming you have Snappy itself deployed to C:\snappy, here is the easiest way to test it:

ant clean test-core -Dwindows=true -Dsnappy.prefix=C:\snappy -Dtestcase=TestCodec

I also successfully tested creating a distro with snappy bundled:

ant clean tar -Dwindows=true -Dforrest.home=C:\apache-forrest-0.9 -Dbundle.snappy=true -Dsnappy.prefix=C:\snappy

Then, I used that distro to test running a wordcount MR job that compresses its output:

hadoop-1.3.0-SNAPSHOT\bin\hadoop.cmd jar hadoop-1.3.0-SNAPSHOT\hadoop-examples-1.3.0-SNAPSHOT.jar wordcount -D mapred.output.compress=true -D mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec /input /output
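
To make explicit what the two -D options in that command contribute, here is a small self-contained sketch that separates "-D key=value" pairs from the remaining arguments. It is a hypothetical stand-in written for illustration, not Hadoop's own option handling, and the class name DashDOptions is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; Hadoop has its own option parsing.
public class DashDOptions {
    /** Properties collected from -D options, in the order given. */
    public final Map<String, String> properties = new LinkedHashMap<>();
    /** Arguments that were not -D options (here: the input and output paths). */
    public final List<String> remaining = new ArrayList<>();

    public static DashDOptions parse(String[] args) {
        DashDOptions result = new DashDOptions();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[++i];            // form: -D key=value
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                pair = arg.substring(2);     // form: -Dkey=value
            }
            if (pair == null) {
                result.remaining.add(arg);   // not a -D option
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + pair);
            }
            result.properties.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        // The option portion of the wordcount command above.
        String[] example = {
            "-D", "mapred.output.compress=true",
            "-D", "mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec",
            "/input", "/output"
        };
        DashDOptions parsed = parse(example);
        System.out.println("properties: " + parsed.properties);
        System.out.println("remaining:  " + parsed.remaining);
    }
}
```

The two properties are the only job settings the test changes: one turns output compression on, the other names the codec class.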

Then, I ran a grep MR job using the snappy-compressed file as input to verify that the codec
could decompress successfully:

hadoop-1.3.0-SNAPSHOT\bin\hadoop.cmd jar hadoop-1.3.0-SNAPSHOT\hadoop-examples-1.3.0-SNAPSHOT.jar grep /output/part* /grepout Apache

(My input file was our LICENSE.txt file, which is why I grepped for "Apache" in my test.)

Big thanks to [~chuanliu] who started a lot of this work.

> Support Snappy codec on Windows.
> --------------------------------
>                 Key: HADOOP-9802
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9802
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 3.0.0, 1-win, 2.1.1-beta
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-9802-branch-1-win.1.patch
>
> Build and test the existing Snappy codec on Windows.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
