spark-user mailing list archives

From Gourav Sengupta <gourav.sengu...@gmail.com>
Subject Re: Delta with intelligent upsett
Date Fri, 01 Nov 2019 06:52:19 GMT
Shouldn't a WHERE clause on the partition field help with that? I may be
missing something in the question.
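
A rough sketch of what I mean, not a tested recipe. It assumes the table is
partitioned by a hypothetical `year` column and keyed on a hypothetical `id`
column; `merge_recent`, `table_path` and `updates_df` are illustrative names
only. The idea is to put a literal predicate on the partition column into the
merge condition so that Delta can, in principle, skip the older partitions.

```python
from datetime import date


def build_merge_condition(key_col, partition_col, min_partition_value,
                          target_alias="t", source_alias="s"):
    """Match on the key and restrict the target side to recent partitions."""
    return (
        f"{target_alias}.{key_col} = {source_alias}.{key_col} "
        f"AND {target_alias}.{partition_col} >= {int(min_partition_value)}"
    )


def merge_recent(spark, table_path, updates_df, window_years=1):
    """Upsert updates_df into the Delta table, touching only recent partitions.

    Assumption: the table has columns `id` (key) and `year` (partition column).
    """
    # Imported lazily so build_merge_condition works without Delta installed.
    from delta.tables import DeltaTable

    min_year = date.today().year - window_years
    condition = build_merge_condition("id", "year", min_year)

    # Drop source rows outside the window; otherwise they could never match
    # the restricted condition and would be inserted as new rows.
    recent_updates = updates_df.where(f"year >= {min_year}")

    (DeltaTable.forPath(spark, table_path).alias("t")
        .merge(recent_updates.alias("s"), condition)
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())
```

Whether the older files are actually skipped depends on how the table is
partitioned, so it is worth checking the query plan or the operation metrics
on a real run.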

Regards,
Gourav

On Thu, Oct 31, 2019 at 9:15 PM ayan guha <guha.ayan@gmail.com> wrote:

>
> Hi
>
> We have a scenario where we have a large table, i.e. 5-6B records. The table
> is a repository of data from the past N years. Some updates may take place
> on the data, and thus we are using a Delta table.
>
> As part of the business process, we know updates can happen only within the
> past M years of records, where M is much smaller than N. E.g. the table can
> hold 20 years of data, but we know updates can happen only for the last
> year, not before that.
>
> Is there some way to indicate this additional intelligence to Delta so it
> can look into only the last year's data while running a merge or update? It
> seems to be an obvious performance booster.
>
> Any thoughts?
> --
> Best Regards,
> Ayan Guha
>
