spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fawze Abujaber <fawz...@gmail.com>
Subject native-lzo library not available
Date Thu, 03 May 2018 12:06:15 GMT
Hi Guys,

I'm running into issue where my spark jobs are failing on the below error,
I'm using Spark 1.6.0 with CDH 5.13.0.

I tried to figure it out with no success.

Will appreciate any help or a direction how to attack this issue.

User class threw exception: org.apache.spark.SparkException: Job aborted
due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent
failure: Lost task 0.3 in stage 1.0 (TID 3, xxxxxx, executor 1):
*java.lang.RuntimeException:
native-lzo library not available*
*at
com.hadoop.compression.lzo.LzoCodec.getDecompressorType(LzoCodec.java:193)*
at
org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:181)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1995)
at
org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1881)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1830)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1844)
at
org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54)
at
com.liveperson.dallas.lp.utils.incremental.DallasGenericTextFileRecordReader.initialize(DallasGenericTextFileRecordReader.java:64)
at
com.liveperson.hadoop.fs.inputs.LPCombineFileRecordReaderWrapper.initialize(LPCombineFileRecordReaderWrapper.java:38)
at
org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initialize(CombineFileRecordReader.java:63)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:168)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:133)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:



I see the LZO at GPextras:

ll
total 104
-rw-r--r-- 1 cloudera-scm cloudera-scm 35308 Oct  4  2017 COPYING.hadoop-lzo
-rw-r--r-- 1 cloudera-scm cloudera-scm 62268 Oct  4  2017
hadoop-lzo-0.4.15-cdh5.13.0.jar
lrwxrwxrwx 1 cloudera-scm cloudera-scm    31 May  3 07:23 hadoop-lzo.jar ->
hadoop-lzo-0.4.15-cdh5.13.0.jar
drwxr-xr-x 2 cloudera-scm cloudera-scm  4096 Oct  4  2017 native




-- 
Take Care
Fawze Abujaber

Mime
View raw message