spark-user mailing list archives

From Jason Nerothin <jasonnerot...@gmail.com>
Subject Re: Structured streaming flatMapGroupWithState results out of order messages when reading from Kafka
Date Tue, 09 Apr 2019 15:54:53 GMT
I had that identical problem. Here’s what I came up with:

https://github.com/ubiquibit-inc/sensor-failure


On Tue, Apr 9, 2019 at 04:37 Akila Wajirasena <akila.wajirasena@gmail.com>
wrote:

> Hi
>
> I have a Kafka topic that is already loaded with data. I use a stateful
> Structured Streaming pipeline with flatMapGroupsWithState to consume the
> data in Kafka in a streaming manner.
>
> However, when I set the shuffle partition count > 1, I get some out-of-order
> messages in each of my GroupState instances. Is this the expected behavior,
> or is message ordering guaranteed when using flatMapGroupsWithState with a
> Kafka source?
>
> This is my pipeline:
>
> Kafka => groupByKey (key from Kafka schema) => flatMapGroupsWithState =>
> Parquet
>
> When I print the Kafka offset for each key inside my state update
> function, the offsets are not in order. I am using Spark 2.3.3.
>
> Thanks & Regards,
> Akila
>
>
>
> --
Thanks,
Jason
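
One common workaround for the symptom described in the quoted message is to
stop relying on the order in which values arrive in the state update function
and instead sort each group's values by Kafka offset before folding them into
the state. The sketch below is plain Python, not Spark code, and it is not
taken from the repository linked above. Event, KeyState, and update_state are
hypothetical names used only for illustration, and the sketch assumes that all
messages for a key come from a single Kafka partition, so their offsets are
directly comparable.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical record: one Kafka message for a key."""
    key: str
    offset: int  # Kafka offset; assumed to come from a single partition
    value: str


@dataclass
class KeyState:
    """Hypothetical per-key state carried from one invocation to the next."""
    last_offset: int = -1
    values: List[str] = field(default_factory=list)


def update_state(
    key: str,
    events: Iterable[Event],
    state: Optional[KeyState],
) -> Tuple[KeyState, List[str]]:
    """Fold one batch of events for `key` into its state, in offset order.

    The incoming iterable is treated as unordered: it is materialized and
    sorted by offset first. Events at or below the last processed offset
    are skipped as replays.
    """
    previous = state if state is not None else KeyState()
    new_state = KeyState(previous.last_offset, list(previous.values))

    emitted: List[str] = []
    for event in sorted(events, key=lambda e: e.offset):
        if event.offset <= new_state.last_offset:
            continue  # already processed in an earlier invocation
        new_state.values.append(event.value)
        new_state.last_offset = event.offset
        emitted.append(event.value)
    return new_state, emitted
```

In a Spark job the same idea would sit inside the function passed to
flatMapGroupsWithState, sorting on the offset column that the Kafka source
exposes. Sorting requires materializing the group's values for that trigger,
so it assumes each key's per-batch data fits in memory.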
