spark-user mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject Re: Update Batch DF with Streaming
Date Mon, 20 Jun 2016 16:09:56 GMT
Hi,

How would you do that without/outside streaming?

Jacek
On 17 Jun 2016 12:12 a.m., "Amit Assudani" <aassudani@impetus.com> wrote:

> Hi All,
>
>
> Can I update batch DataFrames loaded in memory with streaming data?
>
>
> For example:
>
>
> I have an employee DF registered as a temporary table. It has EmployeeID,
> Name, Address, etc. fields. Assume it is very big and takes time to
> load into memory.
>
>
> I have two types of employee events (both carrying the empID in the
> payload) coming in as streams:
>
>
> 1) one that looks up a particular empID in the batch data, does some
> calculation, and persists the results,
>
> 2) one that carries updated values for some of the fields of an empID.
>
>
> I want to keep the employee DF up to date with the updates coming in
> type 2 events, so that future type 1 events use the updated values.
>
>
> So the question is: can I update the employee DF in memory with type 2
> events, or do I need to refresh the whole DF?
>
>
> P.S. I can join the stream with the batch data and get the joined table,
> but I am not sure how to get and use a handle to the joined data for
> subsequent events.
>
>
> Regards,
>
> Amit
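
As a way to think about the "outside streaming" question, here is a minimal sketch in plain Python, not Spark API. It assumes the employee data can be modelled as a keyed table and that, like a DataFrame, the table is never mutated in place: each batch of type 2 events produces a new version, and type 1 lookups use whichever version is current. The names Employee, apply_updates and lookup, and the sample records, are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, Mapping, Optional, Tuple


@dataclass(frozen=True)
class Employee:
    """Hypothetical record standing in for one row of the employee DF."""
    emp_id: int
    name: str
    address: str


def apply_updates(
    table: Mapping[int, Employee],
    updates: Iterable[Tuple[int, Mapping[str, str]]],
) -> Dict[int, Employee]:
    """Apply type 2 events: return a NEW table with the changed fields merged in.

    The input table is left untouched, mirroring the fact that a DataFrame
    is immutable and an "update" means deriving a new one.
    """
    merged = dict(table)  # shallow copy; records themselves are frozen
    for emp_id, changed_fields in updates:
        current = merged.get(emp_id)
        if current is not None:  # assumption: updates for unknown IDs are skipped
            merged[emp_id] = replace(current, **changed_fields)
    return merged


def lookup(table: Mapping[int, Employee], emp_id: int) -> Optional[Employee]:
    """Serve a type 1 event: fetch the current record for an empID, if any."""
    return table.get(emp_id)


# Version 1 of the table, as loaded from the batch source (made-up sample data).
employees = {
    1: Employee(1, "Ann", "Old Street 1"),
    2: Employee(2, "Bob", "High Road 7"),
}

# A type 2 event arrives; later type 1 lookups should use the new version.
employees_v2 = apply_updates(employees, [(1, {"address": "New Avenue 5"})])
```

The point of the sketch is that the "handle" asked about in the P.S. is simply the reference to the latest version of the table: after each batch of updates, subsequent lookups go through employees_v2 rather than employees. Whether rebuilding a new version per batch is affordable for a very large table is a separate question that this sketch does not answer.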
