spark-user mailing list archives

From Tamas Jambor <jambo...@gmail.com>
Subject Re: save spark streaming output to single file on hdfs
Date Tue, 13 Jan 2015 18:35:28 GMT
Thanks. The problem is that we'd like the output to be picked up by Hive.

On Tue Jan 13 2015 at 18:15:15 Davies Liu <davies@databricks.com> wrote:

> On Tue, Jan 13, 2015 at 10:04 AM, jamborta <jamborta@gmail.com> wrote:
> > Hi all,
> >
> > Is there a way to save DStream RDDs to a single file so that another
> > process can pick it up as a single RDD?
>
> It does not need to be a single file; Spark can read any directory as a
> single RDD.
>
> Also, it's easy to union multiple RDDs into a single one.
>
> > It seems that each slice is saved to a separate folder when using the
> > saveAsTextFiles method.
> >
> > I'm using Spark 1.2 with PySpark.
> >
> > thanks,
> >
> > --
> > View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/save-spark-streaming-output-to-single-file-on-hdfs-tp21124.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
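
The suggestion quoted above (read the per-batch directories back as one RDD, or union several RDDs) might look roughly like the following in PySpark. This is only a sketch: `batch_glob`, `read_all_batches` and `union_paths` are hypothetical helper names, `sc` is assumed to be an existing SparkContext, and the example paths are made up. It relies on DStream.saveAsTextFiles naming each batch directory "prefix-TIME_IN_MS[.suffix]" and on SparkContext.textFile accepting directories and glob patterns.

```python
def batch_glob(prefix, suffix=None):
    """Build a glob pattern matching the per-batch output directories.

    DStream.saveAsTextFiles(prefix, suffix) writes one directory per batch,
    named 'prefix-TIME_IN_MS' or 'prefix-TIME_IN_MS.suffix'.
    """
    pattern = prefix + "-*"
    if suffix:
        pattern += "." + suffix
    return pattern


def read_all_batches(sc, prefix, suffix=None):
    """Load every batch directory under the prefix as a single RDD.

    `sc` is an existing SparkContext (an assumption of this sketch).
    """
    return sc.textFile(batch_glob(prefix, suffix))


def union_paths(sc, paths):
    """Alternative: load each path separately, then union into one RDD."""
    return sc.union([sc.textFile(p) for p in paths])
```

For example, if the streaming job called saveAsTextFiles("hdfs:///data/events", "txt"), a downstream job could call read_all_batches(sc, "hdfs:///data/events", "txt"). Note that this only covers reading the output back in Spark; it does not by itself change the directory layout that Hive would see.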
