spark-dev mailing list archives

From: Jungtaek Lim <kabhwan.opensou...@gmail.com>
Subject: Re: Does StreamingSymmetricHashJoinExec work with watermark? I don't think so
Date: Fri, 15 Nov 2019 06:28:39 GMT

Jacek,

Would you mind sharing a query that reproduces this? I'm not sure I follow
without an example of what is "not working".

Thanks,
Jungtaek Lim (HeartSaVioR)

On Tue, Nov 12, 2019 at 12:04 AM Jacek Laskowski <jacek@japila.pl> wrote:

> Hi,
>
> I think watermark does not work for StreamingSymmetricHashJoinExec because
> of the following:
>
> 1. leftKeys and rightKeys have no spark.watermarkDelayMs metadata entry at
> planning [1]
> 2. Since the left and right keys had no watermark delay at planning, the
> code [2] won't find it at execution
>
> Is my understanding correct? If not, can you point me at examples with
> a watermark on 1) join keys and 2) values? Thanks a lot.
>
> [1]
> https://github.com/apache/spark/blob/v3.0.0-preview/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala#L477-L478
>
> [2]
> https://github.com/apache/spark/blob/v3.0.0-preview/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinHelper.scala#L156-L164
>
> Regards,
> Jacek Laskowski
> ----
> https://about.me/JacekLaskowski
> The Internals of Spark SQL https://bit.ly/spark-sql-internals
> The Internals of Spark Structured Streaming
> https://bit.ly/spark-structured-streaming
> The Internals of Apache Kafka https://bit.ly/apache-kafka-internals
> Follow me at https://twitter.com/jaceklaskowski
>
>
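
As a reading aid for points 1 and 2 in the quoted message, here is a minimal Python sketch of the kind of lookup being described: scan the join keys for an attribute whose metadata contains spark.watermarkDelayMs. The Attribute class, the function name, and the left-before-right search order are assumptions made for illustration. This is a simplified model, not Spark's code, and it does not settle which of the two cases the planner actually produces.

```python
from typing import Optional, Sequence

# Metadata key named in the quoted message.
DELAY_KEY = "spark.watermarkDelayMs"


class Attribute:
    """Hypothetical stand-in for a named join-key expression with metadata."""

    def __init__(self, name: str, metadata: Optional[dict] = None):
        self.name = name
        self.metadata = metadata or {}


def watermark_key_ordinal(left_keys: Sequence[Attribute],
                          right_keys: Sequence[Attribute]) -> Optional[int]:
    """Return the index of the first join key tagged with DELAY_KEY, or None.

    Assumption: left keys are checked before right keys.
    """
    for keys in (left_keys, right_keys):
        for index, key in enumerate(keys):
            if DELAY_KEY in key.metadata:
                return index
    return None


# Case 1: the join keys are plain id columns, so no key carries the entry.
plain = watermark_key_ordinal([Attribute("leftId")], [Attribute("rightId")])

# Case 2: the second left key is an event-time column tagged with a delay.
tagged = watermark_key_ordinal(
    [Attribute("leftId"), Attribute("leftTime", {DELAY_KEY: 10000})],
    [Attribute("rightId"), Attribute("rightTime")],
)

print(plain, tagged)  # None 1
```

In this model, a lookup over untagged keys returns nothing, which is the situation the question describes. A reproducing query would show whether the join keys really reach execution without the metadata entry.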
