spark-user mailing list archives

From Chris Teoh <chris.t...@gmail.com>
Subject Re: Implementing Upsert logic Through Streaming
Date Sat, 29 Jun 2019 23:54:59 GMT
Not sure what your needs are here.

If you can afford to wait, increase your micro-batch window to a long
period of time, aggregate your data by key in each micro batch, and then
apply those changes to the Oracle database.
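The per-batch aggregation described above can be sketched in plain Python (the function name and the `(key, timestamp, value)` record shape are illustrative assumptions, not from the thread; in Spark this would typically be a groupBy/max or window over the key):

```python
from datetime import datetime

# Hypothetical sketch of the per-micro-batch step: keep only the latest
# record per key (by timestamp) before applying changes to Oracle.
def latest_per_key(records):
    """records: iterable of (key, timestamp, value) tuples."""
    latest = {}
    for key, ts, value in records:
        # Keep the record with the greatest timestamp for each key.
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, value)
    return {k: v for k, (ts, v) in latest.items()}

batch = [
    ("a", datetime(2019, 6, 25, 9, 0), "old"),
    ("a", datetime(2019, 6, 25, 9, 5), "new"),
    ("b", datetime(2019, 6, 25, 9, 1), "only"),
]
print(latest_per_key(batch))  # {'a': 'new', 'b': 'only'}
```

The longer the micro-batch window, the more duplicate keys collapse into a single upsert, at the cost of latency.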

Since you're streaming from text files, there's no way to pre-partition
your stream. If you were using Kafka, you could partition by record key and
do the summarisation that way before applying the changes to Oracle.
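Applying the deduplicated batch to Oracle is usually done with a MERGE (upsert) statement. A minimal sketch of building one, assuming a hypothetical table `events(id, ts, val)` (the table and column names are not from this thread):

```python
# Hypothetical sketch: build an Oracle MERGE (upsert) statement for one
# deduplicated record. The target table and columns (events, id, ts, val)
# are assumptions for illustration; bind variables use Oracle's :name style.
def build_merge_sql(table="events"):
    return (
        f"MERGE INTO {table} t "
        f"USING (SELECT :id AS id, :ts AS ts, :val AS val FROM dual) s "
        f"ON (t.id = s.id) "
        f"WHEN MATCHED THEN UPDATE SET t.ts = s.ts, t.val = s.val "
        f"WHEN NOT MATCHED THEN INSERT (id, ts, val) "
        f"VALUES (s.id, s.ts, s.val)"
    )

print(build_merge_sql())
```

In practice you would execute this once per record (or batched with `executemany`) inside each micro batch, via JDBC or an Oracle Python driver.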

I hope that helps.

On Tue., 25 Jun. 2019, 9:43 pm Sachit Murarka, <connectsachit@gmail.com>
wrote:

> Hi All,
>
> I will receive records continuously in text-file form (streaming). Each
> record will also have a timestamp field.
>
> Target is Oracle Database.
>
> My goal is to maintain the latest record for each key in Oracle. Could you
> please suggest how this can be implemented efficiently?
>
> Kind Regards,
> Sachit Murarka
>
