kafka-users mailing list archives

From Guy Doulberg <Guy.Doulb...@perion.com>
Subject RE: Consume more than produce
Date Mon, 04 Aug 2014 11:11:49 GMT
Hi Daniel,

I count once when producing and once when consuming. The timestamp is calculated once, before producing, and attached to the message, so the consumer uses the same timestamp to count.
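Roughly like this (a sketch, not our actual code; the Counter stand-in, topic name and payload format are illustrative):

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;

    public class AuditCount {
        interface Counter { void increment(String metric); } // stand-in for a statsd client

        // Compute the minute bucket once, before producing, and carry it in the payload.
        static void produce(Producer<String, String> p, Counter statsd, String body) {
            long minute = (System.currentTimeMillis() / 60000L) * 60000L;
            statsd.increment("events.produced." + minute);   // producer-side count
            p.send(new KeyedMessage<String, String>("events", minute + "|" + body));
        }

        // Consumer side: split out and reuse the embedded minute, never the arrival time.
        static void consume(Counter statsd, byte[] message) {
            String minute = new String(message).split("\\|", 2)[0];
            statsd.increment("events.consumed." + minute);
        }
    }

That way both sides bucket on the same minute, so the two counters are directly comparable.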

Thanks

-----Original Message-----
From: Daniel Compton [mailto:desk@danielcompton.net] 
Sent: Monday, August 04, 2014 12:35 PM
To: users@kafka.apache.org
Subject: Re: Consume more than produce

Hi Guy

In your reconciliation, where was the timestamp coming from? Is it possible that messages
were delivered several times, but your calculations only counted each unique event?

Daniel.

> On 4/08/2014, at 5:35 pm, Guy Doulberg <Guy.Doulberg@perion.com> wrote:
> 
> Hi
> 
> What do you mean by producer ACK value?
> 
> In my code I don't have a retry mechanism; does the Kafka producer API have a retry mechanism?
> 
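For reference, the 0.8 producer does have a retry mechanism of its own: a send that fails or times out is retried up to message.send.max.retries times, and a retried send whose first attempt actually reached the broker is delivered, and counted, twice. A minimal config sketch for the old kafka.javaapi.producer API (broker addresses are placeholders):

    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.ProducerConfig;

    public class RetryingProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // placeholders
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("request.required.acks", "1");    // the ack value Guozhang asks about below
            props.put("message.send.max.retries", "3"); // default: 3 automatic retries
            props.put("retry.backoff.ms", "100");       // pause between retry attempts
            Producer<String, String> producer =
                    new Producer<String, String>(new ProducerConfig(props));
            // If an ack is lost or late, the retry duplicates an event that already
            // made it to the broker, and the consumer group then counts it twice.
            producer.close();
        }
    }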
> 
> -----Original Message-----
> From: Guozhang Wang [mailto:wangguoz@gmail.com]
> Sent: Friday, August 01, 2014 6:08 PM
> To: users@kafka.apache.org
> Subject: Re: Consume more than produce
> 
> What is the ack value used in the producer?
> 
> 
> On Fri, Aug 1, 2014 at 1:28 AM, Guy Doulberg <Guy.Doulberg@perion.com>
> wrote:
> 
>> Hey,
>> 
>> 
>> After a year or so I have Kafka as my streaming layer in my 
>> production, I decided it is time to audit, and to test how many 
>> events do I lose, if I lose events at all.
>> 
>> 
>> I discovered something interesting which I can't explain.
>> 
>> 
>> The producer produces fewer events than the consumer group consumes.
>> 
>> 
>> It is not much more; it is about 0.1% more events.
>> 
>> 
>> I use the high-level Consumer API (not the SimpleConsumer API).
>> 
>> 
>> I was thinking I might have had rebalancing going on in my system, but it 
>> doesn't look like that (see the note on rebalancing after this message).
>> 
>> 
>> Has anyone seen such behaviour?
>> 
>> 
>> In order to audit, I calculated for each event the minute it arrived 
>> and assigned this value to the event. I used statsd to count all 
>> events across my producer cluster and my consumer group cluster.
>> 
>> 
>> I must say that it is not happening every minute.
>> 
>> 
>> Thanks, Guy
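On the rebalancing suspicion above: with the high-level consumer, offsets are committed to ZooKeeper only periodically when auto-commit is on, so a rebalance or a consumer restart replays everything consumed since the last commit. That is the other common way for the consumed count to exceed the produced count. A config sketch (the ZooKeeper address and group name are placeholders):

    import java.util.Properties;
    import kafka.consumer.ConsumerConfig;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class AuditConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "zk1:2181");    // placeholder
            props.put("group.id", "audit-consumers");      // placeholder
            props.put("auto.commit.enable", "true");       // default: commit periodically
            props.put("auto.commit.interval.ms", "60000"); // up to a minute of messages can
                                                           // be replayed after a rebalance
            ConsumerConnector consumer =
                    kafka.consumer.Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
            consumer.shutdown();
        }
    }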
> 
> 
> --
> -- Guozhang
