kafka-users mailing list archives

From Ali Akhtar <ali.rac...@gmail.com>
Subject Re: Tracking when a batch of messages has arrived?
Date Sat, 03 Dec 2016 00:01:23 GMT
Hey Apurva,

I am including the batch_id inside the messages.

Could you give me an example of what you mean by custom control messages
with a control topic please?



On Sat, Dec 3, 2016 at 12:35 AM, Apurva Mehta <apurva@confluent.io> wrote:

> That should work, though it sounds like you may be interested in:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging
>
> If you can include the 'batch_id' inside your messages, and define custom
> control messages with a control topic, then you would not need one topic
> per batch, and you would be very close to the essence of the above
> proposal.
>
> Thanks,
> Apurva
>
> On Fri, Dec 2, 2016 at 5:02 AM, Ali Akhtar <ali.rac200@gmail.com> wrote:
>
> > Heya,
> >
> > I need to send a group of related messages, and then process those
> > messages only when all of them have arrived.
> >
> > Here is how I'm planning to do this. Is this the right way, and can any
> > improvements be made to this?
> >
> > 1) Send a message to a topic called batch_start, with a batch id (which
> > will be a UUID)
> >
> > 2) Post the messages to a topic called batch_msgs_<batch_id>. Here
> > batch_id will be the batch id sent in batch_start.
> >
> > The number of messages sent will be recorded by the producer.
> >
> > 3) Send a message to batch_end with the batch id and the number of sent
> > messages.
> >
> > 4) On the consumer side, using Kafka Streams, I would listen to
> > batch_end.
> >
> > 5) When the message there arrives, I will start another instance of Kafka
> > Streams, which will process the messages in batch_msgs_<batch_id>.
> >
> > 6) Perhaps to be extra safe, whenever batch_end arrives, I will start a
> > throwaway consumer which will just count the number of messages in
> > batch_msgs_<batch_id>. If this count doesn't match the # of messages
> > specified in the batch_end message, then it will assume that the batch
> > hasn't yet finished arriving, and it will wait for some time before
> > retrying. Once the correct # of messages has arrived, THEN it will trigger
> > step 5 above.
> >
> > Will the above method work, or should I make any changes to it?
> >
> > Is step 6 necessary?
> >
>
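
For illustration, here is a rough sketch of how the "batch_id inside the
messages, plus control messages on a control topic" idea from this thread
might look. It is not taken from KIP-98 or from any Kafka API. The JSON
message layout, the helper names (make_data_message, make_control_message,
BatchTracker) and the "start"/"end" control types are all assumptions made
up for this example. The Kafka producer and consumer wiring is left out
entirely, so only the bookkeeping is shown: one shared data topic, one
control topic, and a consumer that releases a batch once the count
announced in the "end" control message has been reached.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


def make_data_message(batch_id: str, payload: dict) -> str:
    """Value for the shared data topic (hypothetical layout).

    Every data message carries its batch_id, so no per-batch topic is needed.
    """
    return json.dumps({"batch_id": batch_id, "payload": payload})


def make_control_message(batch_id: str, kind: str,
                         count: Optional[int] = None) -> str:
    """Value for the control topic (hypothetical layout).

    kind is "start" or "end"; an "end" message also carries the number
    of data messages the producer sent for this batch.
    """
    return json.dumps({"batch_id": batch_id, "type": kind, "count": count})


@dataclass
class BatchTracker:
    """Consumer-side bookkeeping, independent of any Kafka client.

    Buffers payloads per batch and hands a batch back once the count from
    its "end" control message has been reached. Duplicate deliveries are
    NOT handled here; a real consumer would need to deduplicate.
    """
    buffered: dict = field(default_factory=dict)  # batch_id -> list of payloads
    expected: dict = field(default_factory=dict)  # batch_id -> announced count

    def on_data(self, raw: str) -> Optional[list]:
        msg = json.loads(raw)
        self.buffered.setdefault(msg["batch_id"], []).append(msg["payload"])
        return self._release_if_complete(msg["batch_id"])

    def on_control(self, raw: str) -> Optional[list]:
        msg = json.loads(raw)
        if msg["type"] == "end":
            self.expected[msg["batch_id"]] = msg["count"]
        return self._release_if_complete(msg["batch_id"])

    def _release_if_complete(self, batch_id: str) -> Optional[list]:
        want = self.expected.get(batch_id)
        have = self.buffered.get(batch_id, [])
        if want is not None and len(have) >= want:
            # The whole batch is here: forget it and return its payloads.
            self.expected.pop(batch_id)
            return self.buffered.pop(batch_id, [])
        return None  # batch not complete (or "end" not seen yet)
```

Because Kafka only orders messages within a single partition, the "end"
control message can be seen before the last data message. That is why the
sketch checks the count on every data message as well as on every control
message, which plays the role of step 6 in the plan quoted above.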
