kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Garcia <dav...@spiceworks.com>
Subject Re: puncutuate() bug
Date Sat, 08 Oct 2016 19:29:35 GMT
Actually, I think the bug is more subtle.  What happens when a consumed topic stops receiving
messages?  The smallest timestamp will always be the static timestamp of this topic.


On 10/7/16, 5:03 PM, "David Garcia" <davidg@spiceworks.com> wrote:

    Ok I found the bug.  Basically, if there is an empty topic (in the list of topics being
consumed), any partition-group with partitions from the topic will always return -1 as the
smallest timestamp (see PartitionGroup.java).
    To reproduce, simply start a kstreams consumer with one or more empty topics.  Punctuate
will never be called.
    On 10/7/16, 1:11 PM, "David Garcia" <davidg@spiceworks.com> wrote:
        Yeah, this is possible.  We have run the application (and have confirmed data is being
received) for over 30 mins…with a 60-second timer.  So, do we need to just rebuild our cluster
with bigger machines?
        On 10/7/16, 11:18 AM, "Michael Noll" <michael@confluent.io> wrote:
            punctuate() is still data-driven at this point, even when you're using the
            WallClock timestamp extractor.
            To use an example: Imagine you have configured punctuate() to be run every
            5 seconds.  If there's no data being received for a minute, then punctuate
            won't be called -- even though you probably would have expected this to
            happen 12 times during this 1 minute.
            (FWIW, there's an ongoing discussion to improve punctuate(), part of which
            is motivated by the current behavior that arguably is not very intuitive to
            many users.)
            Could this be the problem you're seeing?  See also the related discussion
            On Fri, Oct 7, 2016 at 6:07 PM, David Garcia <davidg@spiceworks.com> wrote:
            > Hello, I’m sure this question has been asked many times.
            > We have a test-cluster (confluent 3.0.0 release) of 3 aws m4.xlarges.  We
            > have an application that needs to use the punctuate() function to do some
            > work on a regular interval.  We are using the WallClock extractor.
            > Unfortunately, the method is never called.  I have checked the
            > filedescriptor setting for both the user as well as the process, and
            > everything seems to be fine.  Is this a known bug, or is there something
            > obvious I’m missing?
            > One note, the application used to work on this cluster, but now it’s not
            > working.  Not really sure what is going on?
            > -David

View raw message