spark-user mailing list archives

From Davies Liu <dav...@databricks.com>
Subject Re: save spark streaming output to single file on hdfs
Date Tue, 13 Jan 2015 18:15:15 GMT
On Tue, Jan 13, 2015 at 10:04 AM, jamborta <jamborta@gmail.com> wrote:
> Hi all,
>
> Is there a way to save dstream RDDs to a single file so that another process
> can pick it up as a single RDD?

It does not need to be a single file; Spark can read any directory as a single RDD.
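A minimal sketch of reading the streaming output back, assuming the batches were written with `saveAsTextFiles(prefix)`, which names each output directory `prefix-<batch time in ms>` (with `.suffix` appended if a suffix is given). The helper names `batch_glob` and `load_all_batches` are hypothetical, and `sc` is assumed to be an existing SparkContext in the reading process.

```python
def batch_glob(prefix, suffix=None):
    """Build a glob that matches every directory written by
    DStream.saveAsTextFiles(prefix, suffix)."""
    pattern = prefix + "-*"
    if suffix:
        pattern += "." + suffix
    return pattern


def load_all_batches(sc, prefix, suffix=None):
    """Read all matching batch directories as one RDD of lines.

    sc.textFile accepts a directory, a glob, or a comma-separated
    list of paths, so no merging into a single file is needed.
    """
    return sc.textFile(batch_glob(prefix, suffix))
```

For example, `load_all_batches(sc, "hdfs:///out/batch")` would read every `hdfs:///out/batch-*` directory; the path here is only an illustration.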

Also, it's easy to union multiple RDDs into a single one.
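A rough sketch of the union approach, assuming `sc` is an existing SparkContext and `paths` is a list of output directories chosen by the caller. `union_batches` is a hypothetical helper name; `SparkContext.union` and `SparkContext.textFile` are the PySpark calls it relies on.

```python
def union_batches(sc, paths):
    """Load each path as its own RDD, then merge them into one RDD."""
    rdds = [sc.textFile(p) for p in paths]
    if not rdds:
        # SparkContext.union needs at least one RDD to work with.
        raise ValueError("paths must contain at least one directory")
    return sc.union(rdds)
```

If a single output file really is required, calling `coalesce(1)` on the resulting RDD before `saveAsTextFile` writes one part file, at the cost of sending all the data through a single task.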

> It seems that each slice is saved to a separate folder, using
> saveAsTextFiles method.
>
> I'm using spark 1.2 with pyspark
>
> thanks,
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/save-spark-streaming-output-to-single-file-on-hdfs-tp21124.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>


