spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: No overwrite flag for saveAsXXFile
Date Fri, 06 Mar 2015 15:22:42 GMT
Found this thread:
http://search-hadoop.com/m/JW1q5HMrge2

Cheers

On Fri, Mar 6, 2015 at 6:42 AM, Sean Owen <sowen@cloudera.com> wrote:

> This was discussed in the past and viewed as dangerous to enable. The
> biggest problem, by far, comes when you have a job that outputs M
> partitions, 'overwriting' a directory of data containing N > M old
> partitions. You suddenly have a mix of new and old data.
>
> It doesn't match Hadoop's semantics either, which won't let you do
> this. You can of course simply remove the output directory.
>
> On Fri, Mar 6, 2015 at 2:20 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > Adding support for an overwrite flag would make saveAsXXFile more
> > user-friendly.
> >
> > Cheers
> >
> >
> >
> >> On Mar 6, 2015, at 2:14 AM, Jeff Zhang <zjffdu@gmail.com> wrote:
> >>
> >> Hi folks,
> >>
> >> I found that RDD.saveAsXXFile has no overwrite flag, which I think would
> >> be very helpful. Is there any reason for this?
> >>
> >>
> >>
> >> --
> >> Best Regards
> >>
> >> Jeff Zhang
> >
>
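
The failure mode Sean describes (a job writing M partitions over a directory
that already holds N > M partitions) can be sketched without Spark. The
following is only a minimal simulation in plain Python, assuming that output is
laid out as one part-0000N file per partition and that a naive "overwrite"
replaces files by name without clearing the directory. The helper names
(write_partitions, read_all, overwrite_safely) are hypothetical and are not
Spark or Hadoop APIs.

```python
import shutil
import tempfile
from pathlib import Path


def write_partitions(out_dir: Path, records, num_partitions: int) -> None:
    """Write records round-robin into part-0000N files.

    Deliberately does NOT clear out_dir first, to mimic a naive overwrite.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    for p in range(num_partitions):
        lines = records[p::num_partitions]
        (out_dir / f"part-{p:05d}").write_text("\n".join(lines) + "\n")


def read_all(out_dir: Path):
    """Read every part file back, as a later job reading the directory would."""
    rows = []
    for part in sorted(out_dir.glob("part-*")):
        rows.extend(part.read_text().split())
    return rows


def overwrite_safely(out_dir: Path, records, num_partitions: int) -> None:
    """Remove the whole output directory first, then write the new data."""
    if out_dir.exists():
        shutil.rmtree(out_dir)
    write_partitions(out_dir, records, num_partitions)


base = Path(tempfile.mkdtemp())
out = base / "output"

old = [f"old-{i}" for i in range(8)]
new = [f"new-{i}" for i in range(4)]

# First job: N = 4 partitions of old data.
write_partitions(out, old, 4)

# Second job: M = 2 partitions written over the same directory.
# part-00002 and part-00003 from the first job are left behind.
write_partitions(out, new, 2)
mixed = read_all(out)
print("naive overwrite:", mixed)

# Removing the directory first leaves only the new data.
overwrite_safely(out, new, 2)
clean = read_all(out)
print("remove, then write:", clean)

shutil.rmtree(base)
```

In the naive case the directory still contains the old part-00002 and
part-00003 files, so a reader of the whole directory sees both old and new
records. Removing the output directory before saving, as suggested above,
avoids this.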
