kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sachin Mittal <sjmit...@gmail.com>
Subject How are custom keys compared during joins
Date Sat, 07 Dec 2019 06:38:35 GMT
Hi,
I am facing some weird problems when joining two streams or a stream/table.
The joined stream does not contain all the joined records.

Also note that my keys are custom keys for which I have implemented equals
and hashcode method
Is there something else also I need to do to ensure key1 === key2

Let me illustrate it with a example:
//created a new stream with a new custom key
stream1Mapped = stream1.map((k,v) -> ...)
//checked the re-partition data and it has 2 records

// created a new stream with a new custom key and converted a stream to
table using standard way
table2Mapped = stream2.map((k,v) -> ...).groupByKey().reduce((av, nv) ->
nv)
//checked the re-partition and change-log data and it has 3 records

//now I join the stream with table on the new custom key
joinedStream = stream1Mapped.join(table2Mapped, (lv, rv) -> ..)

//printed the data for stream
joinedStream.peek((k, v) -> print(v))
//is called only once ??

This should to be called twice as keys for both the records in the stream
are there in table too.

Please let me know if I understood the case well enough and if there is
anything I can do to debug this problem better.

Thanks
Sachin

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message