kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tim Ward <tim.w...@origamienergy.com>
Subject Timing state changes?
Date Thu, 06 Sep 2018 16:00:12 GMT
Hi, looking for some beginner's help with architectural choices.

Scenario 1:

A stream of messages arrives on a Kafka topic. Each has the key being a widget ID, and the
value being an indication of whether the widget is connected or disconnected. For each widget
a message arrives every 30 seconds with the current connected state.

Requirement: emit a message when a widget has been reported as disconnected for at least five
minutes.

Constraints: there can be large numbers of widgets, so horizontal scalability is required.
The solution must be robust against the application, or instances of it, crashing, so persistence
or regeneration of any local/internal state is required, and committing the input messages
has to be got right so that they're reprocessed on a restart if and as necessary.

Observation: without the constraints we'd just read each message, keep an in-memory list of
disconnected widgets with the timestamp at which they went disconnected, then when another
"disconnected" status comes in emit the output message if this widget has now been disconnected
for more than five minutes.

It looks like there's stuff in Kafka Streams that can help here, but I can't see quite where
to start?

Scenario 2:

A stream of messages arrives on a Kafka topic. Each has the key being a widget ID, and the
value being an indication of whether the widget is connected or disconnected. For each widget
a message arrives only when the connected state changes.

Requirement and constraints: as above

Observation: without the constraints we'd just read each message, keep an in-memory list of
widgets whose most recent message indicated a disconnection, and for each of these start a
five minute timer, and if the timer went off before a "reconnected" message was received emit
the output message.

I'm less clear that there's anything out of the box in Kafka Streams that can do this sort
of timeout of non-existent messages? - but I don't understand from the documentation how the
various windowing features work.

Note: the numbers are for the sake of illustration. I can imagine different solutions being
appropriate depending on whether (a) the timeout (five minutes as quoted) is, whilst configurable,
guaranteed to be the same for every widget, or (b) there is a requirement to configure the
timeout separately for each widget.

(Solutions involving updating a database table and then polling it on a cron job are really
the sort of thing we're trying to get away from here.)

Tim Ward

The contents of this email and any attachment are confidential to the intended recipient(s).
If you are not an intended recipient: (i) do not use, disclose, distribute, copy or publish
this email or its contents; (ii) please contact the sender immediately; and (iii) delete this
email. Our privacy policy is available here: https://origamienergy.com/privacy-policy/. Origami
Energy Limited (company number 8619644); Origami Storage Limited (company number 10436515)
and OSSPV001 Limited (company number 10933403), each registered in England and each with a
registered office at: Ashcombe Court, Woolsack Way, Godalming, GU7 1LQ.

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message