kafka-users mailing list archives

From Shayne S <shaynest...@gmail.com>
Subject Re: Log compaction not working as expected
Date Wed, 17 Jun 2015 12:26:05 GMT
Right, you can see I've got segment.ms set.  The trick is that segments don't
actually roll over until something new arrives. If your topic is idle (not
receiving messages), it won't ever roll over to a new segment, and thus the
last segment will never be compacted.
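
To make that concrete, here is a toy model of the behaviour I'm describing.
It is only a sketch, not Kafka's code: ToyLog, Segment, append and compact
are names made up for illustration, and the assumption baked in is the one
above, namely that the segment.ms check only happens when a record is
appended and that the cleaner skips the last (active) segment.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    created_ms: int
    records: list = field(default_factory=list)


class ToyLog:
    """Toy single-partition log (hypothetical, for illustration only)."""

    def __init__(self, segment_ms):
        self.segment_ms = segment_ms
        self.segments = []

    def append(self, now_ms, key, value):
        # Assumption: the age check runs only here, on append.
        active = self.segments[-1] if self.segments else None
        if active is None or now_ms - active.created_ms >= self.segment_ms:
            active = Segment(created_ms=now_ms)
            self.segments.append(active)
        active.records.append((key, value))

    def compact(self):
        # Assumption: only non-active segments are eligible for cleaning.
        closed = self.segments[:-1]
        latest = {}
        for si, seg in enumerate(closed):
            for ri, (key, _) in enumerate(seg.records):
                latest[key] = (si, ri)
        for si, seg in enumerate(closed):
            seg.records = [
                rec for ri, rec in enumerate(seg.records)
                if latest[rec[0]] == (si, ri)
            ]

    def record_count(self):
        return sum(len(seg.records) for seg in self.segments)


log = ToyLog(segment_ms=300_000)
for i in range(3):
    log.append(i * 1000, "ABC123", {"test": 1})

# Idle topic: however long we wait, the only segment is still the active one.
log.compact()
before = log.record_count()

# A new record arriving after segment.ms forces the roll, then cleaning works.
log.append(600_000, "DEF456", {"test": 1})
log.compact()
after = log.record_count()
```

In this model the three ABC123 records all survive the first compact()
call, and only collapse to one once the DEF456 record has forced a roll.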

Thanks!
Shayne

On Wed, Jun 17, 2015 at 5:58 AM, Jan Filipiak <Jan.Filipiak@trivago.com>
wrote:

> Hi,
>
> you might want to have a look here:
> http://kafka.apache.org/documentation.html#topic-config
> _segment.ms_ and _segment.bytes_ should allow you to control the
> time/size when segments are rolled.
>
> Best
> Jan
>
>
> On 16.06.2015 14:05, Shayne S wrote:
>
>> Some further information, and is this a bug?  I'm using 0.8.2.1.
>>
>> Log compaction will only occur on the non-active segments.  Intentional or
>> not, it seems that the last segment is always the active segment.  In other
>> words, an expired segment will not be cleaned until a new segment has been
>> created.
>>
>> As a result, a log won't be compacted until new data comes in (per
>> partition). Does this mean I need to send the equivalent of a pig (
>> https://en.wikipedia.org/wiki/Pigging) through each partition in order to
>> force compaction?  Or can I force the cleaning somehow?
>>
>> Here are the steps to recreate:
>>
>> 1. Create a new topic with a 5 minute segment.ms:
>>
>> kafka-topics.sh --zookeeper localhost:2181 --create --topic TEST_TOPIC
>> --replication-factor 1 --partitions 1 --config cleanup.policy=compact
>> --config min.cleanable.dirty.ratio=0.01 --config segment.ms=300000
>>
>> 2. Repeatedly add messages with identical keys (3x):
>>
>> echo "ABC123,{\"test\": 1}" | kafka-console-producer.sh --broker-list
>> localhost:9092 --topic TEST_TOPIC --property parse.key=true --property
>> key.separator=, --new-producer
>>
>> 3. Wait 5+ minutes and confirm no log compaction.
>> 4. Once satisfied, send a new message:
>>
>> echo "DEF456,{\"test\": 1}" | kafka-console-producer.sh --broker-list
>> localhost:9092 --topic TEST_TOPIC --property parse.key=true --property
>> key.separator=, --new-producer
>>
>> 5. Log compaction will occur soon after.
>>
>> Is my use case of infrequent logs not supported? Is this intentional
>> behavior? It's unnecessarily challenging to target each partition with a
>> dummy message to trigger compaction.
>>
>> Also, I believe there is another issue with logs originally configured
>> without a segment timeout that led to my original issue.  I still cannot
>> get those logs to compact.
>>
>> Thanks!
>> Shayne
>>
>>
>
