spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: Spark 1.0: Unable to Read LZO Compressed File
Date Wed, 02 Jul 2014 03:55:09 GMT
I’d suggest asking the IBM Hadoop folks, but my guess is that the native library cannot be found
in /opt/IHC/lib/native/Linux-amd64-64/. Or, if this exception is happening in your driver
program, the driver’s java.library.path may not include that directory (SPARK_LIBRARY_PATH
from spark-env.sh only applies to processes launched on the cluster).

Matei
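
Both guesses in the reply above can be checked from a shell on the machine that runs the driver. The following is only a minimal sketch: the helper names path_contains and has_native_libs are made up for illustration, the directory is the one from the quoted spark-env.sh, and it assumes the JVM's default java.library.path on Linux is derived from LD_LIBRARY_PATH. It does not inspect a running Spark driver.

```shell
#!/bin/sh
# Directory taken from the spark-env.sh quoted in this thread; override via NATIVE_DIR.
NATIVE_DIR="${NATIVE_DIR:-/opt/IHC/lib/native/Linux-amd64-64}"

# path_contains PATHLIST DIR (hypothetical helper):
# succeeds if DIR is an entry of the colon-separated PATHLIST,
# ignoring a single trailing slash on either side.
path_contains() {
    case ":$1:" in
        *":${2%/}:"*|*":${2%/}/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# has_native_libs DIR (hypothetical helper):
# succeeds if DIR contains at least one file matching *.so*.
has_native_libs() {
    for f in "$1"/*.so*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

if has_native_libs "$NATIVE_DIR"; then
    echo "found shared libraries in $NATIVE_DIR"
else
    echo "no shared libraries in $NATIVE_DIR"
fi

# Assumption: the driver JVM inherits this environment variable.
if path_contains "${LD_LIBRARY_PATH:-}" "$NATIVE_DIR"; then
    echo "LD_LIBRARY_PATH includes $NATIVE_DIR"
else
    echo "LD_LIBRARY_PATH does not include $NATIVE_DIR"
fi
```

If the first check fails, the library is simply not where SPARK_LIBRARY_PATH points. If only the second fails, the directory may need to be added to the driver's library path explicitly.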

On Jul 1, 2014, at 7:15 AM, Uddin, Nasir M. <nuddin@dtcc.com> wrote:

> Dear Spark Users:
>  
> Spark 1.0 has been installed in standalone mode, but it can’t read any compressed (CMX/Snappy)
> or Sequence files residing on HDFS (it can read uncompressed files from HDFS). The most notable
> message is: “Unable to load native-hadoop library…..”. Other related messages are:
>  
> Caused by: java.lang.IllegalStateException: Cannot load com.ibm.biginsights.compress.CmxDecompressor without native library!
>     at com.ibm.biginsights.compress.CmxDecompressor.<clinit>(CmxDecompressor.java:65)
>  
> Here is the core-site.xml’s key part:
> <property>
>   <name>io.compression.codecs</name>
>   <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec,com.ibm.biginsights.compress.CmxCodec</value>
> </property>
>  
> Here is the spark-env.sh:
> export SPARK_WORKER_CORES=4
> export SPARK_WORKER_MEMORY=10g
> export SCALA_HOME=/opt/spark/scala-2.11.1
> export JAVA_HOME=/opt/spark/jdk1.7.0_55
> export SPARK_HOME=/opt/spark/spark-0.9.1-bin-hadoop2
> export ADD_JARS=/opt/IHC/lib/compression.jar
> export SPARK_CLASSPATH=/opt/IHC/lib/compression.jar
> export SPARK_LIBRARY_PATH=/opt/IHC/lib/native/Linux-amd64-64/
> export SPARK_MASTER_WEBUI_PORT=1080
> export HADOOP_CONF_DIR=/opt/IHC/hadoop-conf
>  
> Note: core-site.xml and hdfs-site.xml are in hadoop-conf. CMX is an IBM-branded, splittable,
> LZO-based compression codec.
>  
> Any help to resolve the issue is appreciated.
>  
> Thanks,
> Nasir
> 

