spark-user mailing list archives

From ayan guha <guha.a...@gmail.com>
Subject Re: native-lzo library not available
Date Fri, 04 May 2018 05:48:01 GMT
This seems to be a Cloudera environment issue, so you might get a faster
and more reliable answer in the Cloudera forums.

On Fri, May 4, 2018 at 3:39 PM, Fawze Abujaber <fawzeaj@gmail.com> wrote:

> Hi Yulia,
>
> Thanks for your response.
>
> I see LZO only for Impala:
>
>  [root@xxxxxxx ~]# locate *lzo*.so*
> /opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29/lib/impala/lib/libimpalalzo.so
> /usr/lib64/liblzo2.so.2
> /usr/lib64/liblzo2.so.2.0.0
>
> The /opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29/lib/hadoop/lib/native
> directory has:
>
> -rwxr-xr-x 1 cloudera-scm cloudera-scm 22918 Oct  4  2017 libgplcompression.a
> -rwxr-xr-x 1 cloudera-scm cloudera-scm  1204 Oct  4  2017 libgplcompression.la
> -rwxr-xr-x 1 cloudera-scm cloudera-scm  1205 Oct  4  2017 libgplcompression.lai
> -rwxr-xr-x 1 cloudera-scm cloudera-scm 15760 Oct  4  2017 libgplcompression.so
> -rwxr-xr-x 1 cloudera-scm cloudera-scm 15768 Oct  4  2017 libgplcompression.so.0
> -rwxr-xr-x 1 cloudera-scm cloudera-scm 15768 Oct  4  2017 libgplcompression.so.0.0.0
>
>
> and /opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29/lib/spark-netlib/lib
> has:
>
> -rw-r--r-- 1 cloudera-scm cloudera-scm    8673 Oct  4  2017 jniloader-1.1.jar
> -rw-r--r-- 1 cloudera-scm cloudera-scm   53249 Oct  4  2017 native_ref-java-1.1.jar
> -rw-r--r-- 1 cloudera-scm cloudera-scm   53295 Oct  4  2017 native_system-java-1.1.jar
> -rw-r--r-- 1 cloudera-scm cloudera-scm 1732268 Oct  4  2017 netlib-native_ref-linux-x86_64-1.1-natives.jar
> -rw-r--r-- 1 cloudera-scm cloudera-scm  446694 Oct  4  2017 netlib-native_system-linux-x86_64-1.1-natives.jar
>
>
> Note: The issue occurs only with the Spark job; the MapReduce job works
> fine.
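
Because MapReduce finds the native library but Spark does not, one thing worth checking is whether the Spark driver and executors have the native directory on their library path. The following is a minimal sketch, not a confirmed fix for this cluster. It assumes Spark's `spark.driver.extraLibraryPath` and `spark.executor.extraLibraryPath` properties and reuses the native directory listed above; the helper name `library_path_conf` is hypothetical.

```python
def library_path_conf(native_dir):
    """Build spark-submit --conf arguments that add native_dir to the
    driver and executor native library search paths."""
    props = (
        "spark.driver.extraLibraryPath",
        "spark.executor.extraLibraryPath",
    )
    args = []
    for prop in props:
        args += ["--conf", "{}={}".format(prop, native_dir)]
    return args


# Directory taken from the listing in this thread; adjust for your parcel version.
NATIVE_DIR = (
    "/opt/cloudera/parcels/GPLEXTRAS-5.13.0-1.cdh5.13.0.p0.29"
    "/lib/hadoop/lib/native"
)

# Print the arguments so they can be pasted into a spark-submit command line.
print(" ".join(library_path_conf(NATIVE_DIR)))
```

The printed arguments would go between `spark-submit` and the application jar. In a Cloudera Manager deployment the same properties may be better set through the Spark service configuration.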
>
> On Thu, May 3, 2018 at 9:17 PM, yuliya Feldman <yufeldman@yahoo.com>
> wrote:
>
>> The jar is not enough; you also need the native library (*.so). Check
>> whether your "native" directory contains it:
>>
>> drwxr-xr-x 2 cloudera-scm cloudera-scm  4096 Oct  4  2017 native
>>
>> Also check whether java.library.path or LD_LIBRARY_PATH includes the
>> directory where your *.so library resides.
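
The check described above can be sketched in a few lines of Python. This is an illustration only: it assumes the native library is named `libgplcompression.so` (the hadoop-lzo native library) and that the search path is a separator-delimited string such as `LD_LIBRARY_PATH`; the function name `dirs_containing` is hypothetical.

```python
import os


def dirs_containing(lib_name, search_path):
    """Return the directories in a search path (os.pathsep-separated)
    that contain a regular file named lib_name."""
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing, or doubled separators.
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


# Report which LD_LIBRARY_PATH entries, if any, hold the LZO native library.
ld_path = os.environ.get("LD_LIBRARY_PATH", "")
print(dirs_containing("libgplcompression.so", ld_path))
```

An empty list means no directory on the path contains the library. To be meaningful, the script has to run in the same environment as the Spark executor, since the driver and executors may see a different `LD_LIBRARY_PATH` than an interactive shell.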
>>
>> On Thursday, May 3, 2018, 5:06:35 AM PDT, Fawze Abujaber <
>> fawzeaj@gmail.com> wrote:
>>
>>
>> Hi Guys,
>>
>> I'm running into an issue where my Spark jobs are failing with the error
>> below. I'm using Spark 1.6.0 with CDH 5.13.0.
>>
>> I tried to figure it out with no success.
>>
>> I would appreciate any help or direction on how to approach this issue.
>>
>> User class threw exception: org.apache.spark.SparkException: Job aborted
>> due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent
>> failure: Lost task 0.3 in stage 1.0 (TID 3, xxxxxx, executor 1):
>> *java.lang.RuntimeException: native-lzo library not available*
>> *at com.hadoop.compression.lzo.LzoCodec.getDecompressorType(LzoCodec.java:193)*
>> at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:181)
>> at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1995)
>> at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1881)
>> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1830)
>> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1844)
>> at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54)
>> at com.liveperson.dallas.lp.utils.incremental.DallasGenericTextFileRecordReader.initialize(DallasGenericTextFileRecordReader.java:64)
>> at com.liveperson.hadoop.fs.inputs.LPCombineFileRecordReaderWrapper.initialize(LPCombineFileRecordReaderWrapper.java:38)
>> at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initialize(CombineFileRecordReader.java:63)
>> at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:168)
>> at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:133)
>> at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>> Driver stacktrace:
>>
>>
>>
>> I see the LZO files in GPLEXTRAS:
>>
>> ll
>> total 104
>> -rw-r--r-- 1 cloudera-scm cloudera-scm 35308 Oct  4  2017 COPYING.hadoop-lzo
>> -rw-r--r-- 1 cloudera-scm cloudera-scm 62268 Oct  4  2017 hadoop-lzo-0.4.15-cdh5.13.0.jar
>> lrwxrwxrwx 1 cloudera-scm cloudera-scm    31 May  3 07:23 hadoop-lzo.jar -> hadoop-lzo-0.4.15-cdh5.13.0.jar
>> drwxr-xr-x 2 cloudera-scm cloudera-scm  4096 Oct  4  2017 native
>>
>>
>>
>>
>> --
>> Take Care
>> Fawze Abujaber
>>
>
>
>
> --
> Take Care
> Fawze Abujaber
>



-- 
Best Regards,
Ayan Guha
