spark-dev mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject Does StreamingSymmetricHashJoinExec work with watermark? I don't think so
Date Mon, 11 Nov 2019 15:04:23 GMT
Hi,

I think watermarks do not work with StreamingSymmetricHashJoinExec, for the
following reasons:

1. leftKeys and rightKeys have no spark.watermarkDelayMs metadata entry at
planning time [1].
2. Since the left and right keys have no watermark delay at planning time,
the code at [2] won't find one at execution time.

Is my understanding correct? If not, can you point me to examples with a
watermark on 1) join keys and 2) values? Thanks a lot.
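
For anyone who wants to check this locally, here is a minimal sketch of how I
would probe it. It assumes a local SparkSession and the built-in rate source.
The column names (leftTime, leftValue, rightTime, rightValue), the delays and
the object name are made up for illustration. It only prints whether the
schema fields carry the spark.watermarkDelayMs entry and what the plans look
like; it does not settle what StreamingSymmetricHashJoinExec does at runtime.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

object WatermarkJoinCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("watermark-join-check")
      .getOrCreate()

    // Metadata key named in the message above
    val delayKey = "spark.watermarkDelayMs"

    // Two streaming sides, each with a watermark on its event-time column
    // (illustrative names and delays)
    val left = spark.readStream.format("rate").load()
      .selectExpr("timestamp AS leftTime", "value AS leftValue")
      .withWatermark("leftTime", "10 seconds")
    val right = spark.readStream.format("rate").load()
      .selectExpr("timestamp AS rightTime", "value AS rightValue")
      .withWatermark("rightTime", "20 seconds")

    // Report, per column, whether the watermark delay is in the field metadata
    Seq(left, right).foreach { df =>
      df.schema.fields.foreach { f =>
        println(s"${f.name}: has $delayKey = ${f.metadata.contains(delayKey)}")
      }
    }

    // 1) Watermarked columns used as the join keys
    val joinOnKeys = left.join(right, expr("leftTime = rightTime"))

    // 2) Watermarked columns used only in a range condition on values
    val joinOnValues = left.join(right, expr(
      """leftValue = rightValue AND
        |rightTime >= leftTime AND
        |rightTime <= leftTime + interval 1 minute""".stripMargin))

    // Print the logical and physical plans without starting the queries
    joinOnKeys.explain(true)
    joinOnValues.explain(true)

    spark.stop()
  }
}
```

If the per-column printout shows the entry on leftTime and rightTime, the
remaining question is whether the key expressions handed to the join operator
still reference those attributes.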

[1]
https://github.com/apache/spark/blob/v3.0.0-preview/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala#L477-L478

[2]
https://github.com/apache/spark/blob/v3.0.0-preview/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinHelper.scala#L156-L164

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
The Internals of Spark SQL https://bit.ly/spark-sql-internals
The Internals of Spark Structured Streaming
https://bit.ly/spark-structured-streaming
The Internals of Apache Kafka https://bit.ly/apache-kafka-internals
Follow me at https://twitter.com/jaceklaskowski
