kafka-users mailing list archives

From Damian Guy <damian....@gmail.com>
Subject Re: Kafka Streams: got bit by WindowedSerializer (only window.start is serialized)
Date Mon, 16 Jan 2017 11:31:07 GMT
Hi Nicolas,

I guess you are using the Processor API for your topology? The
WindowedSerializer is an internal class that is used as part of the DSL. In
the DSL a topic is created for each window operation, so we don't need the
end time: it can be calculated from the window size.
However, there is an open JIRA for this:
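Damian's point above can be sketched in a few lines. This is an illustration, not Kafka's actual code: because each DSL window operation writes to its own internal topic, the (fixed) window size is known per topic, and the end boundary is recoverable from the start alone.

```java
// Minimal sketch: with a fixed window size per topic, the window end
// need not be serialized -- it is simply start + size.
public class WindowEnd {
    public static long windowEnd(long windowStartMs, long windowSizeMs) {
        return windowStartMs + windowSizeMs;
    }
}
```

This assumption breaks exactly in Nicolas's case below, where windows of two different sizes share one topic.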


On Mon, 16 Jan 2017 at 11:18 Nicolas Fouché <nfouche@onfocus.io> wrote:

> Hi,
> In the same topology, I generate aggregates with 1-day windows and 1-week
> windows and write them to one single topic. On Mondays, these windows have
> the same start time, so the aggregates override each other.
> That happens because WindowedSerializer [1] only serializes the window
> start time. I'm a bit surprised, since a window by definition has a start
> and an end. I suppose this was done to save on key sizes? And/or on the
> assumption that a topic should not contain aggregates with different
> granularities?
> I have two choices then: either create as many output topics as I have
> granularities, or create my own serializer which also includes the window
> end time. What would the community recommend?
> Getting back to the core problem:
> I could understand that it's not "right" to store different granularities
> in one topic, and I thought doing so would save resources (fewer topics
> for Kafka to manage). But I'm really not sure about this default
> serializer: it does not serialize all instance variables of the `Window`
> class and, more generally, does not comply with the definition of a window.
> [1]
> https://github.com/apache/kafka/blob/0.10.1/streams/src/main/java/org/apache/kafka/streams/kstream/internals/WindowedSerializer.java
> Thanks.
> Nicolas
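The second option Nicolas mentions (a custom serializer that also writes the window end) could be sketched as below. This is a hypothetical codec, not Kafka's API: a real implementation would wrap this logic in `org.apache.kafka.common.serialization.Serializer`, but the byte layout is the interesting part, so it is shown standalone. The key name and layout are illustrative assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical key codec that includes BOTH window boundaries, so a
// 1-day and a 1-week window starting at the same instant produce
// distinct keys. Layout: [key bytes][start: 8 bytes][end: 8 bytes].
public class WindowedKeyCodec {

    public static byte[] serialize(String key, long windowStartMs, long windowEndMs) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(keyBytes.length + 16);
        buf.put(keyBytes);
        buf.putLong(windowStartMs);
        buf.putLong(windowEndMs);
        return buf.array();
    }

    // The fixed-width suffix makes both boundaries recoverable without
    // knowing the key length in advance.
    public static long extractStart(byte[] data) {
        return ByteBuffer.wrap(data).getLong(data.length - 16);
    }

    public static long extractEnd(byte[] data) {
        return ByteBuffer.wrap(data).getLong(data.length - 8);
    }
}
```

With this layout, the Monday collision described above disappears: the daily and weekly aggregates for the same key and start time serialize to different byte arrays.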
