spark-user mailing list archives

From ayan guha <guha.a...@gmail.com>
Subject Fwd: Delta with intelligent upsert
Date Thu, 31 Oct 2019 21:14:51 GMT
Hi

We have a scenario involving a large table, i.e. 5-6B records. The table
is a repository of data from the past N years. Some of that data may be
updated, which is why we are using a Delta table.

As part of the business process, we know updates can happen only to records
from the past M years, where M is much smaller than N. E.g. the table can
hold 20 years of data, but we know updates can happen only to the last
year's data, not before that.

Is there some way to pass this additional knowledge to Delta so that it
looks only at the last year's data while running a merge or update? It
seems like an obvious performance booster.
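
To make the idea concrete, here is a rough sketch of what I have in mind:
putting the known date bound directly into the merge condition, so the
search for matches is limited to recent data. The column names (id,
event_date), the aliases t and s, and both helper functions are
hypothetical, made up for illustration. Whether Delta can actually skip the
older files this way presumably depends on the table being partitioned by
(or having useful file statistics on) the date column.

```python
from datetime import date


def bounded_merge_condition(key_col, date_col, today, years_back=1):
    """Build a merge condition that also bounds the target rows by date.

    key_col and date_col are hypothetical column names; "t" is the target
    alias and "s" the source alias used in the merge below.
    """
    try:
        cutoff = today.replace(year=today.year - years_back)
    except ValueError:
        # today is Feb 29 and the cutoff year is not a leap year
        cutoff = today.replace(year=today.year - years_back, day=28)
    return (
        f"t.{key_col} = s.{key_col} "
        f"AND t.{date_col} >= DATE '{cutoff.isoformat()}'"
    )


def upsert_recent(spark, table_path, updates_df, condition):
    """Run the merge with the bounded condition (requires delta-spark)."""
    # Imported here so the helper above works without delta-spark installed.
    from delta.tables import DeltaTable

    (
        DeltaTable.forPath(spark, table_path).alias("t")
        .merge(updates_df.alias("s"), condition)
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )


condition = bounded_merge_condition("id", "event_date", date(2019, 10, 31))
print(condition)
```

One caveat with this sketch: a source row whose key matches a target record
older than the cutoff would not match and would be inserted as a duplicate,
so the source would need to be limited to the same window.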

Any thoughts?
-- 
Best Regards,
Ayan Guha
