whirr-user mailing list archives

From Paolo Castagna <castagna.li...@googlemail.com>
Subject Re: WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform...
Date Fri, 07 Oct 2011 14:19:26 GMT
Hi Andrei,
you are probably right: there is something you need to do when
installing a Hadoop cluster to support compression:
http://hadoop.apache.org/common/docs/current/native_libraries.html
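
Incidentally, for gzip the warning may not be fatal on its own: as the
message says, Hadoop falls back to "builtin-java classes where
applicable", and GzipCodec has a pure-Java fallback in java.util.zip
(slower than native zlib, but functional). A quick standalone round
trip with the JDK classes, nothing Hadoop-specific, just to show the
built-in implementation works:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {
    public static void main(String[] args) throws Exception {
        byte[] original = "hello, compressed world".getBytes("UTF-8");

        // Compress with the pure-Java gzip implementation.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(buf);
        gz.write(original);
        gz.close(); // flushes and writes the gzip trailer
        byte[] compressed = buf.toByteArray();

        // Decompress and verify the round trip.
        GZIPInputStream in =
            new GZIPInputStream(new ByteArrayInputStream(compressed));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        in.close();
        System.out.println(new String(out.toByteArray(), "UTF-8"));
    }
}
```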

However, it would be nice if Whirr could do that and make compression
available to developers.

I am now going to try to use the DistributedCache to distribute the
necessary shared libraries:
http://hadoop.apache.org/common/docs/current/native_libraries.html#Native+Shared+Libraries
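
Roughly what I have in mind, following the pattern from that doc page.
This is an untested sketch: the HDFS path and namenode address are made
up, and I still need to check that java.library.path on the task JVMs
picks up the working directory.

```java
// Sketch only: ship a native library to task nodes via DistributedCache,
// per the "Native Shared Libraries" section of the Hadoop docs.
// Beforehand, the library must be copied into HDFS, e.g.:
//   bin/hadoop fs -copyFromLocal libhadoop.so /libraries/libhadoop.so
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;

public class NativeLibConfig {
    public static void configure(Configuration conf) throws Exception {
        // Symlink the cached file into each task's working directory.
        DistributedCache.createSymlink(conf);
        // The "#libhadoop.so" fragment names the symlink to create.
        // Hypothetical namenode address and HDFS path:
        DistributedCache.addCacheFile(
            new URI("hdfs://namenode:9000/libraries/libhadoop.so#libhadoop.so"),
            conf);
    }
}
```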

If others are using compression with clusters created via Apache Whirr,
I would be interested to know what they are doing.

Thanks,
Paolo

On 7 October 2011 15:09, Andrei Savu <savu.andrei@gmail.com> wrote:
> Ask on the Hadoop email list if the official release supports compression
> out of the box.
>
> On Oct 7, 2011 5:00 PM, "Paolo Castagna" <castagna.lists@googlemail.com>
> wrote:
>>
>> Hi Andrei,
>> thank you for your reply.
>>
>> As I wrote in my previous message, I have this in my hadoop-ec2.properties
>> file:
>> whirr.hadoop.version=0.20.204.0
>>
>> whirr.hadoop.tarball.url=http://archive.apache.org/dist/hadoop/core/hadoop-${whirr.hadoop.version}/hadoop-${whirr.hadoop.version}.tar.gz
>>
>> Therefore I am using Apache Hadoop release 0.20.204.0.
>>
>> Can anyone confirm that you cannot use compression with Hadoop
>> clusters installed via Whirr?
>>
>> Paolo
>>
>> On 7 October 2011 14:23, Andrei Savu <savu.andrei@gmail.com> wrote:
>> > I am not sure but I think you need a custom Hadoop build to support
>> > compression. Are you using the Apache release or CDH?
>> >
>> > On Oct 7, 2011 3:57 PM, "Paolo Castagna" <castagna.lists@googlemail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >> I am using Apache Whirr 0.6.0-incubating to create small (i.e. 10-20
>> >> nodes)
>> >> Hadoop clusters on Amazon EC2 for testing.
>> >>
>> >> In my hadoop-ec2.properties I have:
>> >>
>> >> whirr.hardware-id=m1.large
>> >> # See http://alestic.com/
>> >> whirr.image-id=eu-west-1/ami-8293a5f6
>> >> whirr.location-id=eu-west-1
>> >> whirr.hadoop.version=0.20.204.0
>> >>
>> >>
>> >> whirr.hadoop.tarball.url=http://archive.apache.org/dist/hadoop/core/hadoop-${whirr.hadoop.version}/hadoop-${whirr.hadoop.version}.tar.gz
>> >>
>> >> I am able to successfully run MapReduce jobs but I see errors as soon
>> >> as I
>> >> try to enable compression (either for the map output or for the
>> >> SequenceFile
>> >> at the end of a job).
>> >>
>> >> In the task logs I see this warning, which I think is relevant:
>> >> WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load
>> >> native-hadoop library for your platform... using builtin-java classes
>> >> where applicable
>> >>
>> >> This is what I have in my "driver(s)", for the intermediate map output:
>> >>
>> >>     if ( useCompression ) {
>> >>         configuration.setBoolean("mapred.compress.map.output", true);
>> >>         configuration.set("mapred.output.compression.type", "BLOCK");
>> >>         configuration.set("mapred.map.output.compression.codec",
>> >> "org.apache.hadoop.io.compress.GzipCodec");
>> >>     }
>> >>
>> >> For the final job output:
>> >>
>> >>     if ( useCompression ) {
>> >>         SequenceFileOutputFormat.setCompressOutput(job, true);
>> >>         SequenceFileOutputFormat.setOutputCompressorClass(job,
>> >> GzipCodec.class);
>> >>         SequenceFileOutputFormat.setOutputCompressionType(job,
>> >> CompressionType.BLOCK);
>> >>     }
>> >>
>> >> As you can see I am trying to use a simple GzipCodec, nothing strange.
>> >>
>> >> What am I doing wrong?
>> >> What should I do in order to be able to use compression in my MapReduce
>> >> jobs?
>> >>
>> >> Thank you in advance for your help,
>> >> Paolo
>> >
>
