kafka-users mailing list archives

From Sachin Mittal <sjmit...@gmail.com>
Subject Understanding how joins work in Kafka streams
Date Sun, 09 Oct 2016 07:19:30 GMT
Hi,
I would like some clarity on how joins actually work on a continuous stream of data.

Say I have two topics that I need to join (left join).
Each data record in a topic is aggregated as (key, value) <=> (string, list).

Topic 1
Key1: [A01, A02, A03, A04 ..]
Key2: [A11, A12, A13, A14 ..]
....

Topic 2
Key1: [B01, B02, B03, B04 ..]
Key2: [B11, B12, B13, B14 ..]
....

Joined topic
Key1: [A01, B01...]
Key2: [A11, B11 ...]

Now let us say I get two new records, [Key1: A05] and [Key1: B05].
As per the aggregation, they are appended to the Key1 lists in Topic 1 and
Topic 2 respectively.

I assume this will trigger the join operation again and the new records will
get appended to the Key1 data in the joined topic. Let me know if my
understanding is correct here.

If I am reading the joined topic using foreach, will I again get a record for
Key1 with the new data appended to the original list, so that my record is now
Key1: [A01, B01..., A05, B05 ... ]?
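
To make the scenario above concrete, here is a rough sketch in plain Java (no
Kafka dependency) of the behaviour I am assuming. The class JoinSketch and its
methods onLeft, onRight and emitted are hypothetical names made up for
illustration; they are not part of the Kafka Streams API. The sketch assumes
that every update to either topic re-runs the left join for that key and emits
the full joined list again, with the Topic 1 values followed by the Topic 2
values. Whether Kafka Streams really behaves this way is exactly what I am
asking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the assumed semantics; NOT the Kafka Streams API.
public class JoinSketch {
    // Latest aggregated list per key, one map per input topic.
    private final Map<String, List<String>> left = new LinkedHashMap<>();
    private final Map<String, List<String>> right = new LinkedHashMap<>();
    // Every record written to the joined topic, in arrival order.
    private final List<String> emitted = new ArrayList<>();

    // A new value arrives on Topic 1: append it, then re-run the join.
    public void onLeft(String key, String value) {
        left.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        emitJoin(key);
    }

    // A new value arrives on Topic 2: append it, then re-run the join,
    // but only if Topic 1 already has this key (assumed left-join rule).
    public void onRight(String key, String value) {
        right.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        if (left.containsKey(key)) {
            emitJoin(key);
        }
    }

    // Assumed joiner: Topic 1 list followed by Topic 2 list (may be empty).
    private void emitJoin(String key) {
        List<String> joined = new ArrayList<>(left.get(key));
        joined.addAll(right.getOrDefault(key, Collections.<String>emptyList()));
        emitted.add(key + ": " + joined);
    }

    public List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        JoinSketch s = new JoinSketch();
        s.onLeft("Key1", "A01");
        s.onRight("Key1", "B01");
        s.onLeft("Key1", "A05");
        s.onRight("Key1", "B05");
        // Prints one line per record emitted to the joined topic.
        s.emitted().forEach(System.out::println);
    }
}
```

If this model is right, a foreach on the joined topic would see four records
for Key1 in this run, the last one being Key1: [A01, A05, B01, B05].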

What I want to ask is: when reading records from a topic, if the value for a
key is modified, will that record be read again (even if it was read before)?
Or is each record read only once by the stream program?

Please let me know how such a scenario works.

Thanks
Sachin
