spark-user mailing list archives

From "Addanki, Santosh Kumar"
Subject saveAsParquetFile and DirectFileOutputCommitter Class not found Error
Date Sun, 07 Dec 2014 19:28:25 GMT

When we try to call saveAsParquetFile on a SchemaRDD, we get the following error:

Py4JJavaError: An error occurred while calling o384.saveAsParquetFile.
: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/lib/output/DirectFileOutputCommitter
        at org.apache.spark.sql.parquet.InsertIntoParquetTable.execute(ParquetTableOperations.scala:240)
        at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
        at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
        at org.apache.spark.sql.SchemaRDDLike$class.saveAsParquetFile(SchemaRDDLike.scala:76)
        at

seems to have addressed this issue of respecting the OutputCommitter, but when I pull from master and try the same, I still encounter this

I am on a MapR distribution, and my org\apache\hadoop\mapreduce\lib\output does not contain
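
One way to confirm whether any jar on the cluster actually ships the missing class is to scan the jar files directly. The following is a minimal sketch, not part of the original report: the helper name jars_containing and the directory passed to it are hypothetical, and it assumes the Hadoop jars can be read from a local directory tree.

```python
import sys
import zipfile
from pathlib import Path


def jars_containing(class_name, jar_dir):
    """Return the sorted names of jars under jar_dir that contain class_name.

    class_name may use dots or slashes, for example
    org.apache.hadoop.mapreduce.lib.output.DirectFileOutputCommitter
    """
    # A class a.b.C is stored inside a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid zip archives
            continue
    return hits


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python find_class.py <fully.qualified.ClassName> <jar directory>
    for name in jars_containing(sys.argv[1], sys.argv[2]):
        print(name)
```

An empty result for the directory that Spark puts on its classpath would be consistent with the NoClassDefFoundError above, though it does not by itself show which build or distribution is expected to provide the class.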

Best Regards,
