spark-user mailing list archives

From Chris Teoh <chris.t...@gmail.com>
Subject Re: Implementing Upsert logic Through Streaming
Date Sun, 30 Jun 2019 20:15:51 GMT
Just thinking on this: if your needs can be addressed using batch processing
instead of streaming, that would be a viable solution. A lambda architecture
approach also seems possible.

On Sun., 30 Jun. 2019, 9:54 am Chris Teoh, <chris.teoh@gmail.com> wrote:

> Not sure what your needs are here.
>
> If you can afford to wait, increase your micro-batch windows to a long
> period of time, aggregate your data by key every micro-batch, and then apply
> those changes to the Oracle database.
>
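
The "aggregate by key, then apply" step above can be sketched in plain
Python; this is only an illustration of the logic, with a dict standing in
for the Oracle table and a list of (key, timestamp, value) tuples standing
in for one micro-batch (the record layout is an assumption, not from the
original thread):

```python
def latest_per_key(batch):
    """Reduce one micro-batch to the most recent record per key."""
    latest = {}
    for key, ts, value in batch:
        # Keep only the record with the highest timestamp for each key.
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, value)
    return latest

def upsert(table, batch):
    """Apply a micro-batch to the target table (dict stands in for Oracle)."""
    for key, (ts, value) in latest_per_key(batch).items():
        current = table.get(key)
        # Only overwrite if the incoming record is newer (handles late data).
        if current is None or ts > current[0]:
            table[key] = (ts, value)

table = {}
upsert(table, [("a", 1, "x"), ("a", 2, "y"), ("b", 1, "p")])
upsert(table, [("a", 1, "stale"), ("b", 3, "q")])
print(table)  # -> {'a': (2, 'y'), 'b': (3, 'q')}
```

Against a real Oracle target, the same per-batch reduction would feed a
MERGE statement instead of a dict update; the timestamp comparison is what
keeps late-arriving records from clobbering newer ones.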
> Since you're streaming from text files, there's no way to pre-partition
> your stream. If you were using Kafka, you could partition by record key and
> do the summarisation that way before applying the changes to Oracle.
>
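
The Kafka suggestion works because a keyed producer routes every record
with the same key to the same partition, so each consumer can summarise its
keys independently. A minimal stand-in for that routing (the real default
partitioner hashes the key bytes with murmur2; this sketch just uses
Python's `hash` for illustration):

```python
def partition_for(key, num_partitions):
    # Simplified stand-in for Kafka's key-based partitioner:
    # the same key always maps to the same partition.
    return hash(key) % num_partitions

def group_by_partition(records, num_partitions):
    """Route (key, value) records the way a keyed Kafka topic would."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions
```

Because all updates for one key land in one partition, they arrive in
order, and the per-key "keep latest" summarisation never races across
consumers.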
> I hope that helps.
>
> On Tue., 25 Jun. 2019, 9:43 pm Sachit Murarka, <connectsachit@gmail.com>
> wrote:
>
>> Hi All,
>>
>> I will receive records continuously in text-file form (streaming). Each
>> record will also have a timestamp field.
>>
>> Target is Oracle Database.
>>
>> My goal is to maintain the latest record for each key in Oracle. Could you
>> please suggest how this can be implemented efficiently?
>>
>> Kind Regards,
>> Sachit Murarka
>>
>
