spark-user mailing list archives

From Prannoy <pran...@sigmoidanalytics.com>
Subject Re: save spark streaming output to single file on hdfs
Date Thu, 15 Jan 2015 16:37:10 GMT
Hi,

You can use the FileUtil.copyMerge API and specify the path to the folder where
saveAsTextFile saves the part files.

Suppose your directory is /a/b/c/

Use FileUtil.copyMerge(FileSystem of source, /a/b/c, FileSystem of
destination, Path to the merged file (say /a/b/c.txt), true (to delete the
original dir), the Hadoop Configuration, null)
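
Since the question mentions pyspark, here is a rough sketch of how that call might look from Python through the JVM gateway. It assumes a Hadoop version that still ships FileUtil.copyMerge (it is present in Hadoop 2.x) and uses the seven-argument form that also takes the Hadoop Configuration. The helper name copy_merge is made up for illustration, and sc._jvm and sc._jsc are private PySpark attributes, so treat this as a sketch and not a supported API.

```python
def copy_merge(sc, src_dir, dst_file, delete_source=True):
    """Merge the part files under src_dir into the single file dst_file.

    sc is an existing SparkContext. The helper name is hypothetical, and
    _jvm / _jsc are private PySpark attributes (an assumption of this sketch).
    """
    hadoop_fs = sc._jvm.org.apache.hadoop.fs
    conf = sc._jsc.hadoopConfiguration()
    # Assumption: source and destination live on the same (default) filesystem.
    fs = hadoop_fs.FileSystem.get(conf)
    return hadoop_fs.FileUtil.copyMerge(
        fs, hadoop_fs.Path(src_dir),   # source FileSystem and directory
        fs, hadoop_fs.Path(dst_file),  # destination FileSystem and merged file
        delete_source,                 # True removes the original directory
        conf,                          # Hadoop Configuration
        None,                          # no extra string appended after each file
    )
```

With the directory from the example this would be called as copy_merge(sc, "/a/b/c", "/a/b/c.txt"). For streaming output, each batch directory written by saveAsTextFiles would need its own call.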

Thanks.

On Tue, Jan 13, 2015 at 11:34 PM, jamborta [via Apache Spark User List] <
ml-node+s1001560n21124h46@n3.nabble.com> wrote:

> Hi all,
>
> Is there a way to save dstream RDDs to a single file so that another
> process can pick it up as a single RDD?
> It seems that each slice is saved to a separate folder, using
> saveAsTextFiles method.
>
> I'm using spark 1.2 with pyspark
>
> thanks,




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/save-spark-streaming-output-to-single-file-on-hdfs-tp21124p21167.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.