kafka-users mailing list archives

From Bartosz Konieczny <bartkoniec...@gmail.com>
Subject Logs compaction
Date Sat, 28 May 2016 05:27:27 GMT
Hello,

I'm studying the part about log retention. For the delete policy I have no
trouble seeing what's going on. Compaction, however, is trickier, so I have
a few questions about it:

1) In the documentation I can see that a message with a key and a null
payload is used as a 'delete' marker:
"Compaction also allows for deletes. A message with a key and a null
payload will be treated as a delete from the log. This delete marker will
cause any prior message with that key to be removed (as would any new
message with that key), but delete markers are special in that they will
themselves be cleaned out of the log after a period of time to free up
space."

Let's suppose we have the following messages: M1, M2, M3, M4, null, M5, M6.
Could you tell me whether my understanding is correct in the cases below?
* In the first case, {M1, M2, M3, M4, null} are in an inactive segment.
Logically, they should be removed, right?
* In the second case, {M1, M2, M3, M4, null, M5} are in an inactive segment
and {M6} is in the active one. LogCleaner should once again remove M1
through null and leave only M5 and M6 (possibly merging these two messages
into a single active segment)?
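
To make the two cases concrete, here is a minimal Python sketch of the
behaviour I would expect. It is only a toy model, not Kafka's LogCleaner.
It assumes that all seven messages share one key "k", that None stands for
the null payload, and that the active segment is never cleaned. The names
compact and drop_tombstones are hypothetical; drop_tombstones stands in for
"after a period of time" in the quoted documentation.

```python
from typing import Dict, List, Optional, Tuple

# (key, value); a value of None models a delete marker (null payload).
Record = Tuple[str, Optional[str]]


def compact(inactive: List[Record], drop_tombstones: bool = False) -> List[Record]:
    """Toy model: keep only the last record per key in the inactive segments.

    drop_tombstones=True models a later cleaning pass, after the period
    during which delete markers are still retained has elapsed.
    """
    # Remember the position of the most recent record for each key.
    last_index: Dict[str, int] = {}
    for i, (key, _value) in enumerate(inactive):
        last_index[key] = i

    kept: List[Record] = []
    for i, (key, value) in enumerate(inactive):
        if last_index[key] != i:
            continue  # superseded by a later record with the same key
        if value is None and drop_tombstones:
            continue  # delete marker itself is cleaned out
        kept.append((key, value))
    return kept


prefix: List[Record] = [("k", "M1"), ("k", "M2"), ("k", "M3"), ("k", "M4"), ("k", None)]

# Case 1: everything up to the delete marker is in an inactive segment.
case1 = compact(prefix)
case1_later = compact(prefix, drop_tombstones=True)

# Case 2: M5 is also inactive; M6 sits in the active segment, left untouched.
active: List[Record] = [("k", "M6")]
case2 = compact(prefix + [("k", "M5")]) + active
```

In this model, case 1 first leaves only the delete marker and later nothing
at all, while case 2 leaves M5 and M6. Whether the real cleaner behaves the
same way is exactly what I am asking.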

2) Does log compaction only 'remove' duplicated messages? Will all other
messages (including the ones that remain after deduplication) be kept
indefinitely?



Best regards,
Bartosz.
