spark-user mailing list archives

From ayan guha <>
Subject Fwd: Delta with intelligent upsert
Date Thu, 31 Oct 2019 21:14:51 GMT

We have a scenario with a large table, i.e. 5-6B records. The table
is a repository of data from the past N years. Some of that data may be
updated, and thus we are using a Delta table.

As part of the business process, we know updates can happen only within the
past M years of records, where M is much smaller than N. E.g. the table can
hold 20 years of data, but we know updates can happen only for the last year,
not before.

Is there some way to indicate this additional intelligence to Delta so it
can look into only the last year's data while running a merge or update? It
seems to be an obvious performance booster.

Any thoughts?
Best Regards,
Ayan Guha
