spark-user mailing list archives

From Gourav Sengupta <gourav.sengu...@gmail.com>
Subject Re: [pyspark 2.3+] Bucketing with sort - incremental data load?
Date Fri, 31 May 2019 06:00:14 GMT
Hi Rishi,

I think that if you sort the data and then append it locally, there will be
no need to bucket it, and external tables will work fine that way.
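
A minimal sketch of what I mean, assuming `df` is the day's
`pyspark.sql.DataFrame` and `path` is the location backing the external
table. The helper name `append_sorted` and its parameters are hypothetical,
used only for illustration:

```python
def append_sorted(df, path, sort_col):
    """Sort rows within each partition, then append Parquet files under `path`.

    `df` is assumed to be a pyspark.sql.DataFrame; `path` is assumed to be
    the directory that the external table points at.
    """
    (
        df.sortWithinPartitions(sort_col)  # sort each partition, no global shuffle
        .write
        .mode("append")                    # add new files, keep existing ones
        .parquet(path)
    )
```

Each daily run would then call something like
`append_sorted(daily_df, "/data/events", "user_id")`. Note that the sort order
holds only within each written file, not across the whole table.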

Regards,
Gourav

On Fri, May 31, 2019 at 3:43 AM Rishi Shah <rishishah.star@gmail.com> wrote:

> Hi All,
>
> Can we use bucketing with sorting to save data incrementally (say, daily)?
> I understand that bucketing is supported in Spark only with saveAsTable;
> however, can it be used with mode "append" instead of "overwrite"?
>
> My understanding of bucketing was that you need to rewrite the entire table
> every time. Can someone please advise?
>
> --
> Regards,
>
> Rishi Shah
>
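
For reference, the bucketed append asked about in the quoted message could be
sketched as below. This assumes `df` is a `pyspark.sql.DataFrame` and that the
target table either does not exist yet or was created with the same bucketing
spec; as far as I know, Spark rejects an append whose bucketing does not match
the existing table. The helper name `append_bucketed` and its parameters are
hypothetical:

```python
def append_bucketed(df, table_name, bucket_col, num_buckets):
    """Append `df` to a bucketed, sorted table via saveAsTable.

    `df` is assumed to be a pyspark.sql.DataFrame. The bucket count and
    column are assumed to match the existing table's definition, if any.
    """
    (
        df.write
        .format("parquet")
        .bucketBy(num_buckets, bucket_col)  # hash rows into a fixed number of buckets
        .sortBy(bucket_col)                 # sort rows within each bucket file
        .mode("append")                     # add files instead of overwriting
        .saveAsTable(table_name)
    )
```

Each append writes additional files per bucket, so rows are sorted within a
file but not across all files of a bucket.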
