spark-dev mailing list archives

From Rishi Mishra <rmis...@snappydata.io>
Subject Re: Joining streaming data with static table data.
Date Tue, 12 Dec 2017 07:29:55 GMT
You can join a streaming Dataset with a static Dataset. I would prefer your
first approach (a DataFrame created with spark.read.format("JDBC")), but the
problem with it is performance: unless you cache the static Dataset, every
join query you fire will fetch the latest records from the table again.
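
To make the suggestion concrete, here is a minimal sketch of a stream-static
join with the static side cached. The JDBC URL, table name, credentials and
the customer_id join column are hypothetical placeholders, and the built-in
"rate" source only stands in for the real stream, so treat this as an
illustration under those assumptions rather than a tested setup.

```scala
import org.apache.spark.sql.SparkSession

object StreamStaticJoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("stream-static-join-sketch")
      .getOrCreate()

    // Static side: read over JDBC and cache it, so that the table is not
    // fetched again for every join. All connection details below are
    // hypothetical placeholders.
    val staticDf = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://dbhost:5432/reports")
      .option("dbtable", "dim_customer")
      .option("user", "spark")
      .option("password", "secret")
      .load()
      .cache()

    // Streaming side: the "rate" source is a stand-in for the real stream.
    // Its "value" column is renamed to the assumed join key.
    val streamDf = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "5")
      .load()
      .selectExpr("value AS customer_id", "timestamp")

    // Stream-static inner join on the assumed key column.
    val joined = streamDf.join(staticDf, Seq("customer_id"), "inner")

    // Print each micro-batch of joined rows to the console.
    val query = joined.writeStream
      .format("console")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}
```

The trade-off is that a cached Dataset will not reflect rows added to the
table afterwards. If you need fresher data, you would have to unpersist and
reload it from time to time.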



Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)

https://in.linkedin.com/in/rishiteshmishra

On Tue, Dec 12, 2017 at 6:29 AM, satyajit vegesna <
satyajit.apasprk@gmail.com> wrote:

> Hi All,
>
> I am working on a real-time reporting project and I have a question about
> a Structured Streaming job that will stream records from a particular
> table and must join them to an existing table.
>
> Stream ----> query/join to another DF/DS ---> update the Stream data
> record.
>
> My problem is how to approach the middle layer (query/join to another
> DF/DS): should I create a DF from spark.read.format("JDBC"), or "stream
> and maintain the data in memory sink", or is there a better way to do
> it?
>
> I would like to know if anyone has faced a similar scenario and has any
> suggestions on how to go ahead.
>
> Regards,
> Satyajit.
>
