kafka-users mailing list archives

From Mike Gould <mikeyg...@gmail.com>
Subject Re: compaction + delete not working for me
Date Tue, 10 Jan 2017 13:18:47 GMT
That's great, thank you. I have it working.
One other thing I noticed: if I send a batch of data and then wait,
compaction never happens. If I send a few more messages later, the first
batch gets compacted. I guess it needs a constant flow of messages to
trigger compaction of completed segments, which shows that my test doesn't
match real life. 😃
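The compaction semantics discussed in the thread below — once a segment is cleaned, only the most recent record per key survives, in offset order — can be sketched as a toy model in Python. This is an illustration of the retention rule only, not Kafka code, and the function name is made up:

```python
# Toy model of log compaction: for each key, only the most recent
# (highest-offset) value survives; survivors keep their original order.
def compact(log):
    """log is a list of (key, value) pairs in offset order."""
    last_offset = {}  # key -> offset of its latest record
    for offset, (key, _) in enumerate(log):
        last_offset[key] = offset
    return [(key, value) for offset, (key, value) in enumerate(log)
            if last_offset[key] == offset]

# Two updates for key "a": only the latest survives.
print(compact([("a", 1), ("b", 2), ("a", 3)]))  # [('b', 2), ('a', 3)]
```

This also illustrates why a fresh subscriber reading from the beginning of a fully compacted topic should see each key at most once.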


On Fri, 6 Jan 2017 at 21:36, Ewen Cheslack-Postava <ewen@confluent.io>
wrote:

> On Fri, Jan 6, 2017 at 3:57 AM, Mike Gould <mikeyg123@gmail.com> wrote:
>
> > Hi
> >
> > I'm trying to configure log compaction + deletion as per KIP-71 in Kafka
> > 0.10.1, but so far haven't had any luck. My tests show more than 50%
> > duplicate keys when reading from the beginning, even several minutes
> > after all the events were sent.
> > The documentation in section 3.1 doesn't seem very clear to me in terms
> > of exactly how to configure particular behaviour. Could someone please
> > clarify a few things for me?
> >
> > In order to significantly reduce the amount of data that new subscribers
> > have to receive, I want to compact events as soon as possible and delete
> > any events more than 24 hours old (e.g. if there hasn't been an update
> > with a matching key for 24h).
> >
> > I have set:
> >
> > cleanup.policy=compact, delete
> > min.cleanable.dirty.ratio=0.5
> > min.compaction.lag.ms=0
> > retention.ms=86400000
> > delete.retention.ms=86460000
> > segment.ms=60000
> >
> >    - Should the cleanup.policy be "compact,delete" or "compact, delete"
> >      or something else?
>
> Either should work; extra leading and trailing spaces are removed.
>
> >    - Are events eligible for compaction soon after the
> >      min.compaction.lag.ms time and segment.ms, or is there another
> >      parameter that affects this? I.e. if I read from the beginning
> >      after a couple of minutes, should I see no more than 50% of the
> >      events received with the same key as previous events?
>
> Maybe you need to modify log.retention.check.interval.ms? It defaults to 5
> minutes. The log cleaner runs periodically, so you may just not have
> waited long enough for cleaning to have executed.
>
> >    - Does the retention.ms parameter only affect the deletion?
> >    - How can I tell if the config is accepted and compaction is working?
> >      Is there something useful to search for in the logs?
>
> Check for logs from LogCleaner.scala. It should log some info when it
> runs.
>
> >    - Also, if I change the topic config via the kafka-configs.sh tool,
> >      does the change take effect immediately for existing events, do I
> >      have to restart the brokers, or does it only affect new events?
>
> Topic config changes shouldn't need a broker restart.
>
> -Ewen
>
> > Thank you
> > Mike G
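For reference, the topic settings discussed in this thread can be applied with the kafka-configs.sh tool Mike mentions. A sketch, assuming a ZooKeeper-based cluster with the address and topic name as placeholders (the bracket syntax for list-valued configs is an assumption for this broker version; check your release's docs):

```shell
# Hypothetical ZooKeeper address and topic name; adjust for your cluster.
# cleanup.policy is list-valued, so brackets keep its comma from being
# read as a separator between configs.
bin/kafka-configs.sh --zookeeper localhost:2181 \
  --entity-type topics --entity-name my-topic --alter \
  --add-config 'cleanup.policy=[compact,delete],min.cleanable.dirty.ratio=0.5,min.compaction.lag.ms=0,retention.ms=86400000,delete.retention.ms=86460000,segment.ms=60000'

# Verify the change took effect (no broker restart needed):
bin/kafka-configs.sh --zookeeper localhost:2181 \
  --entity-type topics --entity-name my-topic --describe
```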
