spark-dev mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject Re: [SS][2.4.4] Confused with "WatermarkTracker: Event time watermark didn't move"?
Date Mon, 14 Oct 2019 20:12:39 GMT
Hi HeartSaVioR,

> It might be due to empty batch

Yeah...that's my understanding too. It's for a streaming aggregation in
Append output mode so that's possible. I'll have a closer look at it.

Thanks much for keeping up with this and the other questions. Much
appreciated!

Pozdrawiam,
Jacek Laskowski
----
https://about.me/JacekLaskowski
The Internals of Spark SQL https://bit.ly/spark-sql-internals
The Internals of Spark Structured Streaming
https://bit.ly/spark-structured-streaming
The Internals of Apache Kafka https://bit.ly/apache-kafka-internals
Follow me at https://twitter.com/jaceklaskowski



On Mon, Oct 14, 2019 at 12:42 AM Jungtaek Lim <kabhwan.opensource@gmail.com>
wrote:

> It might be due to an empty batch (triggered when there are stateful
> operators and the previous batch advanced the watermark), which has no
> input and therefore does not move the watermark.
>
> Did you only turn on DEBUG for WatermarkTracker? If you turn on DEBUG for
> MicroBatchExecution as well, it would log "Completed batch <batchId>", so
> if I'm not mistaken, that line should appear between "Updating event-time
> watermark" and "Event time watermark didn't move". You can also attach a
> streaming query listener to get more information about batches.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Tue, Oct 8, 2019 at 6:12 PM Jacek Laskowski <jacek@japila.pl> wrote:
>
>> Hi,
>>
>> I haven't spent much time on it, but the following DEBUG message
>> from WatermarkTracker sparked my interest :)
>>
>> I ran a streaming aggregation in Append mode and got the messages:
>>
>> 19/10/08 10:48:56 DEBUG WatermarkTracker: Observed event time stats 0:
>> EventTimeStats(15000,1000,8000.0,2)
>> 19/10/08 10:48:56 INFO WatermarkTracker: Updating event-time watermark
>> from 0 to 5000 ms
>> 19/10/08 10:48:56 DEBUG WatermarkTracker: Event time watermark didn't
>> move: 5000 < 5000
>>
>> I think the DEBUG message "Event time watermark didn't move" looks
>> incorrect given that the query has just started and event time stats were
>> observed. The message would be accurate if the event-time watermark had
>> been 5000 before, but it was not, since the query started from scratch
>> (no checkpointed state).
>>
>> Can anyone shed some light on this? I'll be digging deeper in a bit, but
>> am hoping to get some more info first. Thanks!
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> ----
>> https://about.me/JacekLaskowski
>> The Internals of Spark SQL https://bit.ly/spark-sql-internals
>> The Internals of Spark Structured Streaming
>> https://bit.ly/spark-structured-streaming
>> The Internals of Apache Kafka https://bit.ly/apache-kafka-internals
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
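
The empty-batch explanation discussed in this thread can be illustrated with a toy model. The sketch below is not Spark's WatermarkTracker code; it is a simplified, single-operator approximation written for illustration. The class name WatermarkModel and the 10-second delay are assumptions (the delay is inferred from the log, where a maximum event time of 15000 ms yields a watermark of 5000 ms). It assumes the tracker only advances the watermark when the new candidate is strictly greater than the current one, and otherwise logs the "didn't move" message with a literal "<", which would print "5000 < 5000" when the two values are equal.

```python
class WatermarkModel:
    """Toy single-operator model of a watermark update; not Spark's actual code."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms          # assumed watermark delay threshold
        self.operator_watermark_ms = 0    # per-operator value carried across batches
        self.global_watermark_ms = 0
        self.log = []

    def update(self, event_times_ms):
        # Only a batch that carried rows produces event-time stats.
        if event_times_ms:
            max_event_time = max(event_times_ms)
            self.log.append(f"Observed max event time {max_event_time}")
            candidate = max_event_time - self.delay_ms
            self.operator_watermark_ms = max(self.operator_watermark_ms, candidate)

        chosen = self.operator_watermark_ms
        if chosen > self.global_watermark_ms:
            self.log.append(
                f"Updating event-time watermark from {self.global_watermark_ms} to {chosen} ms"
            )
            self.global_watermark_ms = chosen
        else:
            # Equal values also land here, hence the odd-looking "5000 < 5000".
            self.log.append(
                f"Event time watermark didn't move: {chosen} < {self.global_watermark_ms}"
            )


model = WatermarkModel(delay_ms=10_000)
model.update([1_000, 15_000])  # batch with data: min 1000, max 15000, count 2
model.update([])               # follow-up batch with no input
for line in model.log:
    print(line)
```

Under these assumptions, the second call stands in for the no-input batch that follows a watermark advance: the carried-over value equals the current watermark, so the model reports that the watermark didn't move even though the query has only just started.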
