kafka-users mailing list archives

From Nicolas Fouché <nfou...@onfocus.io>
Subject Re: Kafka Streams: got bit by WindowedSerializer (only window.start is serialized)
Date Mon, 16 Jan 2017 22:47:56 GMT
Hi Eno,
I thought it would be impossible to put this in Kafka because of backward
incompatibility with the existing windowed keys, no?
In my case, I had to create a new output topic, reset the topology, and
reprocess all my data.
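
To make the key layout discussed in the quoted messages below concrete, here is a minimal sketch of a windowed key that carries both the window start and the window end. It deliberately does not use the Kafka `Serializer` interface and is not the code from the gist or from Kafka itself; the class name `WindowedKeyCodec` and the byte layout (key bytes, then 8-byte start, then 8-byte end) are assumptions made for illustration only.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not part of Kafka Streams.
// Assumed layout: [serialized key bytes][8-byte window start][8-byte window end]
final class WindowedKeyCodec {
    private static final int TIMESTAMP_BYTES = Long.BYTES;

    // Appends the window start and end (epoch millis) after the key bytes.
    static byte[] encode(byte[] keyBytes, long startMs, long endMs) {
        return ByteBuffer.allocate(keyBytes.length + 2 * TIMESTAMP_BYTES)
                .put(keyBytes)
                .putLong(startMs)
                .putLong(endMs)
                .array();
    }

    // The original key is everything before the two trailing timestamps.
    static byte[] decodeKey(byte[] data) {
        return Arrays.copyOfRange(data, 0, data.length - 2 * TIMESTAMP_BYTES);
    }

    // Window start is the second-to-last long in the array.
    static long decodeStart(byte[] data) {
        return ByteBuffer.wrap(data).getLong(data.length - 2 * TIMESTAMP_BYTES);
    }

    // Window end is the last long in the array.
    static long decodeEnd(byte[] data) {
        return ByteBuffer.wrap(data).getLong(data.length - TIMESTAMP_BYTES);
    }
}
```

With this layout, a 1-day window and a 1-week window that start at the same instant produce different key bytes, so they no longer overwrite each other in a shared topic. Keys written with a start-only layout are 8 bytes shorter and cannot be read back with this sketch, which is the incompatibility mentioned above.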

2017-01-16 23:05 GMT+01:00 Eno Thereska <eno.thereska@gmail.com>:

> Nicolas,
>
> I'm checking with Bill who originally was interested in KAFKA-4468. If he
> isn't actively working on it, why don't you give it a go and create a pull
> request (PR) for it? That way your contribution is properly acknowledged
> etc. We can help you through with that.
>
> Thanks
> Eno
> > On 16 Jan 2017, at 18:46, Nicolas Fouché <nfouche@onfocus.io> wrote:
> >
> > My current implementation:
> > https://gist.github.com/nfo/eaf350afb5667a3516593da4d48e757a . I just
> > appended the window `end` at the end of the byte array.
> > Comments and suggestions are welcome !
> >
> >
> > 2017-01-16 15:48 GMT+01:00 Nicolas Fouché <nfouche@onfocus.io>:
> >
> >> Hi Damian,
> >>
> >> I recall now that I copied the `WindowedSerde` class [1] from the
> >> Confluent examples, which uses the internal `WindowedSerializer` class.
> >> Better to write my own Serde then. You're right, I should not rely on
> >> internal classes, especially for data written outside Kafka Streams
> >> topologies.
> >>
> >> Thanks for the insights on KAFKA-4468.
> >>
> >> [1] https://github.com/confluentinc/examples/blob/89db45c6890cf757b8e18565bdf7bc23f119a2ff/kafka-streams/src/main/java/io/confluent/examples/streams/utils/WindowedSerde.java
> >>
> >> Nicolas.
> >>
> >> 2017-01-16 12:31 GMT+01:00 Damian Guy <damian.guy@gmail.com>:
> >>
> >>> Hi Nicolas,
> >>>
> >>> I guess you are using the Processor API for your topology? The
> >>> WindowedSerializer is an internal class that is used as part of the DSL.
> >>> In the DSL a topic will be created for each window operation, so we don't
> >>> need the end time as it can be calculated from the window size.
> >>> However, there is an open jira for this:
> >>> https://issues.apache.org/jira/browse/KAFKA-4468
> >>>
> >>> Thanks,
> >>> Damian
> >>>
> >>> On Mon, 16 Jan 2017 at 11:18 Nicolas Fouché <nfouche@onfocus.io> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> In the same topology, I generate aggregates with 1-day windows and 1-week
> >>>> windows and write them to one single topic. On Mondays, these windows have
> >>>> the same start time. The effect: these aggregates override each other.
> >>>>
> >>>> That happens because WindowedSerializer [1] only serializes the window
> >>>> start time. I'm a bit surprised: a window has by definition a start and an
> >>>> end. I suppose one wanted to save on key sizes? And/or one would consider
> >>>> that topics should not contain aggregates with different granularities?
> >>>>
> >>>> I have two choices then: either create as many output topics as I have
> >>>> granularities, or create my own serializer which also includes the window
> >>>> end time. What would the community recommend?
> >>>>
> >>>> Getting back to the core problem:
> >>>> I could understand that it's not "right" to store different granularities
> >>>> in one topic, and I thought it would save resources (fewer topics for Kafka
> >>>> to manage). But I'm really not sure about this default serializer: it does
> >>>> not serialize all instance variables of the `Window` class, and more
> >>>> generally does not comply with the definition of a window.
> >>>>
> >>>> [1] https://github.com/apache/kafka/blob/0.10.1/streams/src/main/java/org/apache/kafka/streams/kstream/internals/WindowedSerializer.java
> >>>>
> >>>> Thanks.
> >>>> Nicolas
> >>>>
> >>>
> >>
> >>
>
>
