spark-user mailing list archives

From Stevo Slavić <ssla...@gmail.com>
Subject Re: New consumer - offset one gets in poll is not offset one is supposed to commit
Date Fri, 24 Jul 2015 16:56:14 GMT
Hello Cody,

I'm not sure we're talking about the same thing.

Since you're mentioning streams, I guess you're referring to the current high
level consumer, while I'm talking about the new, not yet released, high level
consumer.

The poll I was referring to is
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L715

The ConsumerRecord offset I was referring to is
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecord.java#L22

poll returns ConsumerRecords, a per-TopicPartition collection of
ConsumerRecord objects. In the example I gave, the ConsumerRecord offset would be 0.

The commit I was referring to is
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L805

After processing the ConsumerRecord that was read, commit expects me to submit not
offset 0 but offset 1...
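To make the off-by-one concrete, here is a minimal in-memory model of the convention (plain Java for illustration, not the Kafka API itself): the committed offset is treated as the position of the *next* record to fetch, so after processing the record at offset N you commit N + 1.

```java
import java.util.Arrays;

// Minimal in-memory sketch of the committed-offset convention.
// The committed offset is the NEXT fetch position, so after processing
// the record at offset N the consumer commits N + 1.
public class CommitOffsetModel {

    // Simulated partition log: the record at index i has offset i.
    static final String[] LOG = { "m0", "m1", "m2" };

    // Return all records from the given start position to the end of the log,
    // mimicking a consumer resuming from a committed offset.
    static String[] pollFrom(long position) {
        return Arrays.copyOfRange(LOG, (int) position, LOG.length);
    }

    public static void main(String[] args) {
        long firstOffset = 0L;            // offset of the record poll() returned
        long committed = firstOffset + 1; // what commit expects: the next position

        // Resuming from the committed offset skips the already-processed record.
        String[] remaining = pollFrom(committed);
        System.out.println(remaining.length); // 2 records left
        System.out.println(remaining[0]);     // m1, not m0 again
    }
}
```

If the consumer committed offset 0 instead, a restart would re-deliver the record at offset 0, which is exactly the double-processing the +1 convention avoids.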

Kind regards,
Stevo Slavic

On Fri, Jul 24, 2015 at 6:31 PM, Cody Koeninger <cody@koeninger.org> wrote:

> Well... there are only 2 hard problems in computer science: naming things,
> cache invalidation, and off-by-one errors.
>
> The direct stream implementation isn't asking you to "commit" anything.
> It's asking you to provide a starting point for the stream on startup.
>
> Because offset ranges are inclusive start, exclusive end, it's pretty
> natural to use the end of the previous offset range as the beginning of the
> next.
>
>
> On Fri, Jul 24, 2015 at 11:13 AM, Stevo Slavić <sslavic@gmail.com> wrote:
>
>> Hello Apache Kafka community,
>>
>> Say there is only one topic, with a single partition and a single message on
>> it.
>> Calling poll with the new consumer will return a ConsumerRecord for
>> that message, and it will have an offset of 0.
>>
>> After processing the message, the current KafkaConsumer implementation expects
>> one to commit not offset 0 as processed, but offset 1, the next
>> offset/position one would like to consume.
>>
>> Does this sound strange to you as well?
>>
>> I'm wondering whether this offset+1 handling for the next position to read
>> couldn't have been done in one place (in the KafkaConsumer implementation, the
>> broker, or wherever) instead of every user of KafkaConsumer having to do it.
>>
>> Kind regards,
>> Stevo Slavic.
>>
>
>
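The half-open range convention Cody describes (inclusive start, exclusive end) can be sketched as follows; the OffsetRange type here is a hypothetical helper for illustration, not a class from Kafka or Spark.

```java
// Sketch of half-open offset ranges: each range covers [fromOffset, untilOffset),
// so the end of one batch is exactly the start of the next, with no gap and
// no overlap. OffsetRange is a hypothetical helper, not a Kafka/Spark class.
public class OffsetRanges {
    record OffsetRange(long fromOffset, long untilOffset) {} // [from, until)

    public static void main(String[] args) {
        OffsetRange first = new OffsetRange(0, 3); // records 0, 1, 2

        // The next batch naturally starts where the previous one ended,
        // which is why the "commit end-of-range" convention composes cleanly.
        OffsetRange next = new OffsetRange(first.untilOffset(), 6); // records 3, 4, 5

        System.out.println(next.fromOffset());                      // 3
        System.out.println(next.untilOffset() - next.fromOffset()); // 3 records
    }
}
```

Under this convention, the exclusive end of a processed range is the same number one would commit as the next fetch position, which is the connection between the two views in this thread.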
