spark-user mailing list archives

From Joe Ammann <...@pyx.ch>
Subject Re: Spark structured streaming leftOuter join not working as I expect
Date Mon, 10 Jun 2019 22:06:43 GMT
Hi all

It took me some time to extract the issues into a piece of standalone code. I created
the following gist:

https://gist.github.com/jammann/b58bfbe0f4374b89ecea63c1e32c8f17

It has messages for 4 topics A/B/C/D and a simple Python program which shows 6 use cases, with
my expectations and observations with Spark 2.4.3.

It would be great if you could have a look and check whether I'm doing something wrong, or
whether this is indeed a limitation of Spark.

On 6/5/19 5:35 PM, Jungtaek Lim wrote:
> Nice to hear you're investigating the issue deeply.
> 
> Btw, if attaching code is not easy, maybe you could share logical/physical plan on any
> batch: "detail" in SQL tab would show up the plan as string. Plans from sequential batches
> would be much helpful - and streaming query status in these batch (especially watermark) should
> be helpful too.
> 
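For reference, the plan and watermark asked for above can probably also be collected from the driver instead of the SQL tab. The following is only a minimal sketch: the helper names `summarize_progress` and `summarize_batches` are made up for illustration, and it assumes the progress entries are plain dicts with an `eventTime` map containing `watermark` and `max`, which is how PySpark exposes `StreamingQuery.recentProgress`. The `query` variable in the usage comments is assumed to be a running `StreamingQuery`.

```python
from typing import Any, Dict, List, Optional


def summarize_progress(progress: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Pick the fields most relevant to watermark debugging from one progress dict."""
    # "eventTime" can be missing or empty, e.g. before any event-time data was seen.
    event_time = progress.get("eventTime") or {}
    return {
        "batchId": progress.get("batchId"),
        "numInputRows": progress.get("numInputRows"),
        "watermark": event_time.get("watermark"),
        "maxEventTime": event_time.get("max"),
    }


def summarize_batches(recent: List[Dict[str, Any]]) -> List[Dict[str, Optional[Any]]]:
    """Summarize a list of progress dicts, one entry per batch, in the given order."""
    return [summarize_progress(p) for p in recent]


# Usage against a running query (assumption: `query` is a StreamingQuery):
#   query.explain(extended=True)   # prints the logical and physical plans
#   for row in summarize_batches(query.recentProgress):
#       print(row)
```

Printing one summary line per batch makes it easier to see whether the watermark advances between sequential batches, which is what decides when unmatched rows of an outer join can be emitted.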


-- 
CU, Joe


