spark-user mailing list archives

From Holden Karau <hol...@pigscanfly.ca>
Subject Re: saveasSequenceFile with codec and compression type
Date Wed, 22 Oct 2014 16:50:25 GMT
Hi gpatcham,

If you want to save as a sequence file with a specific compression type, you
can use saveAsHadoopFile and set the "mapred.output.compression.type"
property on the JobConf you pass to it. If you want to keep using
saveAsSequenceFile, whose syntax is much nicer, you could instead set that
property on the SparkConf, but then it would apply to every job that uses
that configuration. Looking at SequenceFileOutputFormat.java, the default
seems to be RECORD, so if that fits your needs you can just use the default
too :)
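
A rough sketch of the saveAsHadoopFile route, in case it helps. The object
name, the helper compressedJobConf, the sample data, and the use of args(0)
as the output path are all made up for illustration. It assumes the old
"mapred" Hadoop API and Writable keys and values. The codec is set on the
JobConf here rather than passed as the codec argument, so that all the
compression settings live in one place.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.hadoop.io.SequenceFile.CompressionType
import org.apache.hadoop.io.compress.GzipCodec
import org.apache.hadoop.mapred.{FileOutputFormat, JobConf, SequenceFileOutputFormat}
import org.apache.spark.{SparkConf, SparkContext}
// Brings the pair-RDD implicits into scope on older Spark 1.x releases.
import org.apache.spark.SparkContext._

object SaveCompressedSequenceFile {

  // Hypothetical helper: copies the base Hadoop configuration into a JobConf
  // and turns on gzip output compression with the requested compression type.
  def compressedJobConf(base: Configuration, compressionType: CompressionType): JobConf = {
    val conf = new JobConf(base)
    FileOutputFormat.setCompressOutput(conf, true)
    FileOutputFormat.setOutputCompressorClass(conf, classOf[GzipCodec])
    // The property named in this thread; valid values are NONE, RECORD, BLOCK.
    conf.set("mapred.output.compression.type", compressionType.toString)
    conf
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("seqfile-compression").setMaster("local[*]"))
    val target = args(0) // output directory, must not exist yet

    // Sample data, converted to Writables so it can go into a sequence file.
    val pairs = sc.parallelize(Seq(1 -> "a", 2 -> "b"))
      .map { case (k, v) => (new IntWritable(k), new Text(v)) }

    // Only this save is affected; the shared configuration is left untouched.
    pairs.saveAsHadoopFile(
      target,
      classOf[IntWritable],
      classOf[Text],
      classOf[SequenceFileOutputFormat[IntWritable, Text]],
      compressedJobConf(sc.hadoopConfiguration, CompressionType.BLOCK))

    sc.stop()
  }
}
```

Swapping CompressionType.BLOCK for CompressionType.RECORD in the last
argument should be the only change needed to get record-level compression.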

Cheers,

Holden

On Mon, Oct 20, 2014 at 2:41 PM, gpatcham <gpatcham@gmail.com> wrote:

> Hi All,
>
> I'm trying to save an RDD as a sequence file and am not able to set the
> compression type (BLOCK or RECORD).
>
> Can anyone let me know how to specify the compression type?
>
> Here is the code I'm using:
>
>
> RDD.saveAsSequenceFile(target,Some(classOf[org.apache.hadoop.io.compress.GzipCodec]))
>
> Thanks
>


-- 
Cell : 425-233-8271
