kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nitin Supekar <ni...@ecinity.com>
Subject Re: CallbackHandler in Kafka 0.8
Date Thu, 04 Jul 2013 12:44:02 GMT
Thanks Joel for the response. Sure, I can achieve same by using some data
structure before we push messages to Kafka.
I was hoping if I can utilize any kafka capabilities specially as we are
using async producer.

thanks


On Wed, Jul 3, 2013 at 11:18 PM, Joel Koshy <jjkoshy.w@gmail.com> wrote:

> Hi Nitin,
>
>
> > We receive events from external source ( for example, facebook status
> > update events). These events are pushed to kafka queue when received
> There
> > is possibility of duplicate event ( multiple facebook status update
> events
> > for same account  in quick intervals ) coming again and gets pushed into
> > kafka queue. At consumer end, we do not want to process duplicate events
> (
> > connect to facebook and fetch the status). We would prefer not to have
> > another data structure to  single instance the events received. There is
> > time limit in which whatever events received, we want to do single
> > instancing ( at once fetch all the status update events received in five
> > minutes for single account ). In async producer, events are anyways not
> > written to broker synchronously. They are batched and then pushed after
> > predefined time interval. That's perfect for us. We just want to look at
> > the batch and delete duplicate events from it and then push it broker.
>
> However, this depends on your event rate - i.e., if it's a very high
> event rate then the batch threshold is reached and send happens before
> that time interval. Furthermore, even if event rate is low, the
> de-duplication applies to a time-window of at most
> queue.buffering.max.ms).
>
> In 0.7, batches would be taken and the call-back handler's
> beforeSendingData method would be invoked on those batches. i.e., you
> can achieve the same effect as you did in 0.7 by keeping keeping an
> additional buffer (before the actual send) of
> queue.buffering.max.messages (i.e., the batch size) and de-duping
> within that buffer.
>
> Thanks,
>
> Joel
>
> > On Wed, Jul 3, 2013 at 3:07 AM, Joel Koshy <jjkoshy.w@gmail.com> wrote:
> >
> >> Callback handlers are no longer supported in 0.8. Can you go into why
> >> the filtering needs to be done at this stage as opposed to before
> >> actually sending to the producer?
> >>
> >> Thanks,
> >>
> >> Joel
> >>
> >> On Tue, Jul 2, 2013 at 10:41 AM, Nitin Supekar <nitin@ecinity.com>
> wrote:
> >> > Hello-
> >> >
> >> >    Is CallbackHandler supported in Kafka 0.8 for async producers?
> >> >
> >> > If yes, can I use it to alter the batched messages before they are
> pushed
> >> > to broker? For example, I may want to delete some of the messages in
> the
> >> > batch based on some business logic in my application?
> >> >
> >> > If no, is there any alternate way? I want to do some kind of single
> >> > instancing on messages pushed in kafka in last X minutes.
> >> >
> >> > thanks
> >>
>
>
> On Wed, Jul 3, 2013 at 1:33 AM, Nitin Supekar <nitin@ecinity.com> wrote:
> > Hello-
> >
> > We receive events from external source ( for example, facebook status
> > update events). These events are pushed to kafka queue when received
> There
> > is possibility of duplicate event ( multiple facebook status update
> events
> > for same account  in quick intervals ) coming again and gets pushed into
> > kafka queue. At consumer end, we do not want to process duplicate events
> (
> > connect to facebook and fetch the status). We would prefer not to have
> > another data structure to  single instance the events received. There is
> > time limit in which whatever events received, we want to do single
> > instancing ( at once fetch all the status update events received in five
> > minutes for single account ). In async producer, events are anyways not
> > written to broker synchronously. They are batched and then pushed after
> > predefined time interval. That's perfect for us. We just want to look at
> > the batch and delete duplicate events from it and then push it broker.
> >
> > Possible?
> >
> > thanks
> >
> >
> > On Wed, Jul 3, 2013 at 3:07 AM, Joel Koshy <jjkoshy.w@gmail.com> wrote:
> >
> >> Callback handlers are no longer supported in 0.8. Can you go into why
> >> the filtering needs to be done at this stage as opposed to before
> >> actually sending to the producer?
> >>
> >> Thanks,
> >>
> >> Joel
> >>
> >> On Tue, Jul 2, 2013 at 10:41 AM, Nitin Supekar <nitin@ecinity.com>
> wrote:
> >> > Hello-
> >> >
> >> >    Is CallbackHandler supported in Kafka 0.8 for async producers?
> >> >
> >> > If yes, can I use it to alter the batched messages before they are
> pushed
> >> > to broker? For example, I may want to delete some of the messages in
> the
> >> > batch based on some business logic in my application?
> >> >
> >> > If no, is there any alternate way? I want to do some kind of single
> >> > instancing on messages pushed in kafka in last X minutes.
> >> >
> >> > thanks
> >>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message