spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: PySpark saveAsTextFile gzip
Date Fri, 16 Jan 2015 06:51:34 GMT
You can use saveAsNewAPIHadoopFile
<http://spark.apache.org/docs/1.1.0/api/python/pyspark.rdd.RDD-class.html#saveAsNewAPIHadoopFile>
to compress your output. Here's some sample code
<https://github.com/ScrapCodes/spark-1/blob/master/python/pyspark/tests.py#L1225>
that uses the API.
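
For reference, here is a minimal sketch of that approach. It is not taken from the linked test file. It assumes an RDD of (key, value) string pairs, the Hadoop 2 configuration key names (Hadoop 1 used mapred.output.compress and mapred.output.compression.codec), and a Hadoop build with the gzip codec available. The helper names gzip_output_conf and save_gzipped are made up for illustration.

```python
# Standard Hadoop gzip codec class, passed to Spark as a plain string.
GZIP_CODEC = "org.apache.hadoop.io.compress.GzipCodec"


def gzip_output_conf(codec=GZIP_CODEC):
    """Build a Hadoop job conf dict that turns on output compression.

    Assumption: Hadoop 2 key names. On Hadoop 1 the equivalent keys are
    mapred.output.compress and mapred.output.compression.codec.
    """
    return {
        "mapreduce.output.fileoutputformat.compress": "true",
        "mapreduce.output.fileoutputformat.compress.codec": codec,
    }


def save_gzipped(pair_rdd, path):
    """Write an RDD of (key, value) string pairs as compressed text.

    Hypothetical helper: TextOutputFormat writes each record as
    key<TAB>value, and the conf above asks Hadoop to gzip the part files.
    """
    pair_rdd.saveAsNewAPIHadoopFile(
        path,
        "org.apache.hadoop.mapreduce.lib.output.TextOutputFormat",
        keyClass="org.apache.hadoop.io.Text",
        valueClass="org.apache.hadoop.io.Text",
        conf=gzip_output_conf(),
    )
```

With a running SparkContext sc, a call might look like save_gzipped(sc.parallelize([("a", "1"), ("b", "2")]), "s3n://my-bucket/output"), where the bucket path is only a placeholder. Later PySpark releases also accept a compressionCodecClass argument on saveAsTextFile, which is simpler if your version has it.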

Thanks
Best Regards

On Thu, Jan 15, 2015 at 5:16 PM, Tom Seddon <mr.tom.seddon@gmail.com> wrote:

> Hi,
>
> I've searched but can't seem to find a PySpark example.  How do I write
> compressed text file output to S3 using PySpark saveAsTextFile?
>
> Thanks,
>
> Tom
>
