kafka-users mailing list archives

From Shayne S <shaynest...@gmail.com>
Subject Re: Log compaction not working as expected
Date Tue, 16 Jun 2015 13:49:24 GMT
Thank you for the response!

Unfortunately, those improvements would not help.  What prevents compaction
is that, without new activity, no new segment is ever created.

I was confused about what qualifies as the active segment. The active
segment is always the last segment in the log, not the segment that would
be written to if a message arrived right now.
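
To make the distinction concrete, here is a toy Python model of a single
partition. It is only a sketch of the behavior observed in this thread, not
Kafka's actual code: the names ToyLog, Segment, append and compact are made
up for illustration, and it assumes the segment.ms check is performed only
when a message is appended, so an idle log never rolls a new segment.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    created_ms: int
    records: list = field(default_factory=list)  # (key, value) pairs


class ToyLog:
    """Hypothetical model of one partition; the last segment is always active."""

    def __init__(self, segment_ms):
        self.segment_ms = segment_ms
        self.segments = []

    def append(self, now_ms, key, value):
        # Assumption: the age check runs only here, on append.
        # Idle time alone never creates a new segment.
        last = self.segments[-1] if self.segments else None
        if last is None or now_ms - last.created_ms >= self.segment_ms:
            self.segments.append(Segment(created_ms=now_ms))
        self.segments[-1].records.append((key, value))

    def compact(self):
        # Everything except the last (active) segment is eligible for cleaning.
        inactive = self.segments[:-1]
        if not inactive:
            return  # only the active segment exists, nothing to clean
        latest = {}
        for seg in inactive:
            for key, value in seg.records:
                latest.pop(key, None)  # re-insert so order follows the last write
                latest[key] = value
        cleaned = Segment(inactive[0].created_ms, list(latest.items()))
        self.segments = [cleaned, self.segments[-1]]

    def record_count(self):
        return sum(len(seg.records) for seg in self.segments)


log = ToyLog(segment_ms=300_000)
for t in (0, 1_000, 2_000):
    log.append(t, "ABC123", '{"test": 1}')

log.compact()               # more than 5 minutes may pass; still one segment
print(log.record_count())   # 3

log.append(600_000, "DEF456", '{"test": 1}')  # this append rolls a new segment
log.compact()
print(log.record_count())   # 2
```

In this model the three ABC123 records collapse to one only after the
DEF456 message forces a roll, which matches the reproduction steps quoted
below.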

On Tue, Jun 16, 2015 at 8:38 AM, Manikumar Reddy <kumar@nmsworks.co.in>
wrote:

> Hi,
>
>   Your observation is correct.  We never compact the active segment.
>   Some improvements are proposed here:
>   https://issues.apache.org/jira/browse/KAFKA-1981
>
>
> Manikumar
>
> On Tue, Jun 16, 2015 at 5:35 PM, Shayne S <shaynest113@gmail.com> wrote:
>
> > Some further information, and is this a bug?  I'm using 0.8.2.1.
> >
> > Log compaction will only occur on the non active segments.  Intentional
> > or not, it seems that the last segment is always the active segment.  In
> > other words, an expired segment will not be cleaned until a new segment
> > has been created.
> >
> > As a result, a log won't be compacted until new data comes in (per
> > partition). Does this mean I need to send the equivalent of a pig (
> > https://en.wikipedia.org/wiki/Pigging) through each partition in order
> > to force compaction?  Or can I force the cleaning somehow?
> >
> > Here are the steps to recreate:
> >
> > 1. Create a new topic with a 5 minute segment.ms:
> >
> > kafka-topics.sh --zookeeper localhost:2181 --create --topic TEST_TOPIC
> > --replication-factor 1 --partitions 1 --config cleanup.policy=compact
> > --config min.cleanable.dirty.ratio=0.01 --config segment.ms=300000
> >
> > 2. Repeatedly add messages with identical keys (3x):
> >
> > echo "ABC123,{\"test\": 1}" | kafka-console-producer.sh --broker-list
> > localhost:9092 --topic TEST_TOPIC --property parse.key=true --property
> > key.separator=, --new-producer
> >
> > 3. Wait 5+ minutes and confirm no log compaction.
> > 4. Once satisfied, send a new message:
> >
> > echo "DEF456,{\"test\": 1}" | kafka-console-producer.sh --broker-list
> > localhost:9092 --topic TEST_TOPIC --property parse.key=true --property
> > key.separator=, --new-producer
> >
> > 5. Log compaction will occur soon after.
> >
> > Is my use case of infrequent logs not supported? Is this intentional
> > behavior? It's unnecessarily challenging to target each partition with a
> > dummy message to trigger compaction.
> >
> > Also, I believe there is another issue with logs originally configured
> > without a segment timeout that led to my original issue.  I still cannot
> > get those logs to compact.
> >
> > Thanks!
> > Shayne
> >
>
