spark-user mailing list archives

From Sachit Murarka <connectsac...@gmail.com>
Subject Re: Implementing Upsert logic Through Streaming
Date Mon, 01 Jul 2019 04:04:16 GMT
Hi Chris,

I have to make sure my DB holds the latest value for every record at any
given point in time.
Say the following is the data; I have to take the 4th row for EmpId 2.
Also, if any Emp's details are already in Oracle, I have to update them
with the latest value from the stream.

EmpId, salary, timestamp
1, 1000, 1234
2, 2000, 2234
3, 2000, 3234
2, 2100, 4234
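The keep-latest-per-key step described above can be sketched in plain Python
(the function name `latest_by_key` is illustrative, not from the thread):

```python
def latest_by_key(rows):
    """For each EmpId, keep only the row with the greatest timestamp."""
    latest = {}
    for emp_id, salary, ts in rows:
        # Overwrite only when this record is newer than the one we hold.
        if emp_id not in latest or ts > latest[emp_id][1]:
            latest[emp_id] = (salary, ts)
    return latest

# The sample data from above: EmpId 2 should resolve to its 4th-row value.
rows = [(1, 1000, 1234), (2, 2000, 2234), (3, 2000, 3234), (2, 2100, 4234)]
result = latest_by_key(rows)
```

The same reduction is what a per-key aggregation in a Spark micro batch would
compute before the rows are written to Oracle.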

Thanks
Sachit

On Mon, 1 Jul 2019, 01:46 Chris Teoh, <chris.teoh@gmail.com> wrote:

> Just thinking on this: if your needs can be addressed using batch instead
> of streaming, that would be a viable approach. A lambda architecture also
> seems like a possible solution.
>
> On Sun., 30 Jun. 2019, 9:54 am Chris Teoh, <chris.teoh@gmail.com> wrote:
>
>> Not sure what your needs are here.
>>
>> If you can afford to wait, increase your micro batch windows to a long
>> period of time, aggregate your data by key every micro batch and then apply
>> those changes to the Oracle database.
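Applying those per-key aggregates to the database could look like the
following sketch, which uses SQLite's `INSERT ... ON CONFLICT` (stdlib, so it
runs anywhere) as a stand-in for Oracle's `MERGE`; the table and column names
are assumed, not from the thread:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empid INTEGER PRIMARY KEY, salary INTEGER, ts INTEGER)")

def upsert_latest(conn, batch):
    """Apply one micro batch's per-key aggregate: insert new keys,
    update existing ones only if the incoming record is newer."""
    for emp_id, (salary, ts) in batch.items():
        conn.execute(
            "INSERT INTO emp (empid, salary, ts) VALUES (?, ?, ?) "
            "ON CONFLICT(empid) DO UPDATE SET "
            "  salary = excluded.salary, ts = excluded.ts "
            "WHERE excluded.ts > emp.ts",  # never overwrite with stale data
            (emp_id, salary, ts),
        )
    conn.commit()

# Two micro batches: the second carries a newer record for EmpId 2.
upsert_latest(conn, {1: (1000, 1234), 2: (2000, 2234)})
upsert_latest(conn, {2: (2100, 4234), 3: (2000, 3234)})
row_for_emp2 = conn.execute("SELECT salary, ts FROM emp WHERE empid = 2").fetchone()
```

Against Oracle the same guard (`WHERE excluded.ts > emp.ts` here) would live in
the `MERGE ... WHEN MATCHED THEN UPDATE` clause, so late or replayed batches
cannot regress a row.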
>>
>> Since you're using text files as the stream source, there's no way to
>> pre-partition your stream. If you're using Kafka, you could partition by
>> record key and do the summarisation that way before applying the changes
>> to Oracle.
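The point of keying by record: Kafka's default partitioner hashes the record
key, so every update for one EmpId lands in order on the same partition. A
minimal stand-in sketch of that routing (crc32 here replaces Kafka's murmur2
hash, purely for illustration):

```python
import zlib

def partition_for(key, num_partitions=3):
    """Deterministically map a record key to a partition, mimicking
    Kafka's key-hash partitioner (which actually uses murmur2)."""
    return zlib.crc32(str(key).encode()) % num_partitions

# All updates for EmpId 2 map to one partition, so a single consumer sees
# them in order and can summarise per key before writing to Oracle.
p = partition_for(2)
```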
>>
>> I hope that helps.
>>
>> On Tue., 25 Jun. 2019, 9:43 pm Sachit Murarka, <connectsachit@gmail.com>
>> wrote:
>>
>>> Hi All,
>>>
>>> I will receive records continuously in text file form (streaming). Each
>>> record will also have a timestamp field.
>>>
>>> Target is Oracle Database.
>>>
>>> My goal is to maintain the latest record for each key in Oracle. Could
>>> you please suggest how this can be implemented efficiently?
>>>
>>> Kind Regards,
>>> Sachit Murarka
>>>
>>
