spark-user mailing list archives

From: Ivoirians <kvns...@gmail.com>
Subject: Empty RDD after LzoTextInputFormat in newAPIHadoopFile
Date: Tue, 29 Jul 2014 18:05:47 GMT

Hello,

There seems to be very little documentation on the usage of newAPIHadoopFile
and even less of it in conjunction with opening LZO compressed files. I've
hit a wall with some unexpected behavior that I don't know how to interpret.

This is a test program I'm running in an effort to get this working, after
finding previous threads on this subject.
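
A minimal sketch of this kind of test, not the exact program from the message: it assumes the hadoop-lzo jar (which provides com.hadoop.mapreduce.LzoTextInputFormat) is on the driver and executor classpath, and the object name and argument handling are hypothetical.

```scala
import com.hadoop.mapreduce.LzoTextInputFormat // assumption: provided by the hadoop-lzo project
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name, used only for illustration.
object LzoCountTest {
  def main(args: Array[String]): Unit = {
    // Path of the LZO-compressed file in HDFS, passed as the first argument.
    val input = args(0)

    val sc = new SparkContext(new SparkConf().setAppName("LzoCountTest"))

    // Read the file through the new (mapreduce) Hadoop API.
    // Keys are byte offsets, values are the decompressed lines.
    val records = sc.newAPIHadoopFile[LongWritable, Text, LzoTextInputFormat](
      input,
      classOf[LzoTextInputFormat],
      classOf[LongWritable],
      classOf[Text])

    // Print the number of records read from the file.
    println(records.count())

    sc.stop()
  }
}
```

The new-API method is paired here with the com.hadoop.mapreduce variant of the input format; the com.hadoop.mapred variant belongs to the old API and would not type-check with newAPIHadoopFile.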



The job runs on a YARN cluster, and input is the path of a very much
non-empty LZO file sitting in HDFS, which I can manually decompress and read
as a text file, with a count of ~3 million lines. What I don't know how to
interpret is that the above code runs without complaints and prints 0. I
would appreciate some guidance on where to go next; there are no error
messages to point me anywhere, just an empty RDD.

Thanks,
Kevin



