spark-user mailing list archives

From "DEVAN M.S." <msdeva...@gmail.com>
Subject Re: reducing number of output files
Date Fri, 23 Jan 2015 01:32:34 GMT
rdd.coalesce(1) will coalesce the RDD into a single partition and give only one output file.
coalesce(2) will give two, and so on.
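
For illustration, here is a minimal PySpark sketch of that pattern. The helper name save_with_fewer_files and its parameters are hypothetical and not from this thread; it only assumes that the object passed in is an RDD, which provides coalesce() and saveAsTextFile().

```python
def save_with_fewer_files(rdd, path, num_files=1):
    """Write `rdd` as text with at most `num_files` part files.

    `rdd` is assumed to be a PySpark RDD. The helper name and its
    parameters are hypothetical, chosen only for this illustration.
    """
    if num_files < 1:
        raise ValueError("num_files must be at least 1")
    # coalesce() merges existing partitions without a shuffle by default,
    # so it can lower the partition count but not raise it.
    merged = rdd.coalesce(num_files)
    # saveAsTextFile() writes one part file per partition of `merged`.
    merged.saveAsTextFile(path)
    return merged
```

With an existing RDD this would be called as save_with_fewer_files(rdd, "some/output/dir", 2), where the path is a placeholder. Note that coalesce(1) funnels the whole write through a single task, so it is probably only a good idea when the output is small.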
On Jan 23, 2015 4:58 AM, "Sean Owen" <sowen@cloudera.com> wrote:

> One output file is produced per partition. If you want fewer, use
> coalesce() before saving the RDD.
>
> On Thu, Jan 22, 2015 at 10:46 PM, Kane Kim <kane.isturm@gmail.com> wrote:
> > How can I reduce the number of output files? Is there a parameter to
> > saveAsTextFile?
> >
> > Thanks.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> > For additional commands, e-mail: user-help@spark.apache.org
> >
>
