kafka-users mailing list archives

From Jiangjie Qin <j...@linkedin.com.INVALID>
Subject Re: New Producer API - batched sync mode support
Date Fri, 01 May 2015 04:26:19 GMT
Roshan,

If I understand correctly, you just want to make sure that a number of
messages have been sent successfully. Using a callback might be an easier
way to do that.

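import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.RecordMetadata;

public class MyCallback implements Callback {
	// Callbacks run on the producer's I/O thread, so use a thread-safe set.
	public final Set<RecordMetadata> failedSend =
			Collections.synchronizedSet(new HashSet<RecordMetadata>());
	@Override
	public void onCompletion(RecordMetadata metadata, Exception exception) {
		// Caution: metadata may be null on failure in some client versions.
		if (exception != null)
			failedSend.add(metadata);
	}

	public boolean hasFailure() { return !failedSend.isEmpty(); }
}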

In main code, you just need to do the following:
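{
MyCallback callback = new MyCallback();
for (ProducerRecord record : records)
	producer.send(record, callback);	// pass the same callback with every send

producer.flush();	// blocks until all previously sent records have completed
if (callback.hasFailure()) {
	// do something
}
}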

This avoids the loop that checks every future and gives you pretty much
the same guarantee as the old producer, if not a better one.
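
For anyone who wants to try the pattern without a broker, here is a minimal, self-contained sketch of the callback approach above. Everything in it is a stand-in: SendCallback, FakeProducer, FailureCollector and BatchSend are hypothetical names, records are plain strings, and the fake producer simply rejects any record starting with "bad". It only illustrates the shape of send-with-callback followed by flush(); it is not the Kafka client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the producer's completion callback.
interface SendCallback {
    void onCompletion(String record, Exception exception);
}

// Remembers only the failed sends, so the caller never scans successful ones.
class FailureCollector implements SendCallback {
    private final ConcurrentLinkedQueue<String> failed = new ConcurrentLinkedQueue<>();

    @Override
    public void onCompletion(String record, Exception exception) {
        if (exception != null) {
            failed.add(record);
        }
    }

    boolean hasFailure() { return !failed.isEmpty(); }

    List<String> failedRecords() { return new ArrayList<>(failed); }
}

// Hypothetical in-memory producer: sends complete asynchronously on one
// background thread, and flush() blocks until all earlier sends are done.
class FakeProducer {
    private final ExecutorService io = Executors.newSingleThreadExecutor();

    void send(String record, SendCallback callback) {
        io.execute(() -> {
            // Assumption for the demo: records starting with "bad" fail.
            Exception error = record.startsWith("bad")
                    ? new RuntimeException("rejected: " + record) : null;
            callback.onCompletion(record, error);
        });
    }

    void flush() {
        try {
            // The single I/O thread runs tasks in FIFO order, so once this
            // marker task finishes, every earlier send has completed.
            io.submit(() -> { }).get();
        } catch (Exception e) {
            throw new IllegalStateException("flush interrupted or failed", e);
        }
    }

    void close() { io.shutdown(); }
}

class BatchSend {
    // Sends the whole batch, waits for it, and returns only the failures.
    static List<String> sendBatch(List<String> records) {
        FakeProducer producer = new FakeProducer();
        FailureCollector callback = new FailureCollector();
        try {
            for (String record : records) {
                producer.send(record, callback);
            }
            producer.flush();
            return callback.failedRecords();
        } finally {
            producer.close();
        }
    }
}
```

With this sketch, BatchSend.sendBatch(Arrays.asList("a", "bad-1", "b")) returns a list containing only "bad-1". The work done after flush() is proportional to the number of failures, not to the size of the batch.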

Jiangjie (Becket) Qin
	

On 4/30/15, 4:54 PM, "Roshan Naik" <roshan@hortonworks.com> wrote:

>@Gwen, @Ewen,
>  While atomicity of a batch is nice to have, it is not essential. I don't
>think users always expect such atomicity. Atomicity is not even guaranteed
>in many un-batched systems let alone batched systems.
>
>As long as the client gets informed about the ones that failed in the
>batch, that would suffice.
>
>One issue with the current flush() based batch-sync implementation is that
>the client needs to iterate over *all* futures in order to scan for any
>failed messages. In the common case, it is just wasted CPU cycles as there
>won't be any failures. Would be ideal if the client is informed about only
>problematic messages.
>
>  IMO, adding a new send(batch) API may be meaningful if it can provide
>benefits beyond what user can do with a simple wrapper on existing stuff.
>For example: eliminate the CPU cycles wasted on examining results from
>successful message deliveries, or other efficiencies.
>
>
>
>@Ivan,
>   I am not certain, but I think it is possible that the first few
>messages of the batch get accepted but not the remainder. At the same
>time, based on some comments made earlier, it appears the underlying
>implementation does have an all-or-none mechanism for a batch going to a
>partition.
>For simplicity, streaming clients may not want to deal explicitly with
>partitions (and get exposed to repartitioning & leader change type issues).
>-roshan
>
>
>
>On 4/30/15 2:07 PM, "Gwen Shapira" <gshapira@cloudera.com> wrote:
>
>>Why do we think atomicity is expected, if the old API we are emulating
>>here lacks atomicity?
>>
>>I don't remember emails to the mailing list saying: "I expected this
>>batch to be atomic, but instead I got duplicates when retrying after a
>>failed batch send".
>>Maybe atomicity isn't as strong a requirement as we believe? That is,
>>everyone expects some duplicates during failure events and handles them
>>downstream?
>>
>>
>>
>>On Thu, Apr 30, 2015 at 2:02 PM, Ivan Balashov <ibalashov@gmail.com>
>>wrote:
>>
>>> 2015-04-30 8:50 GMT+03:00 Ewen Cheslack-Postava <ewen@confluent.io>:
>>>
>>> > They aren't going to get this anyway (as Jay pointed out) given the
>>> > current broker implementation
>>> >
>>>
>>> Is it also incorrect to assume atomicity even if all messages in the
>>> batch go to the same partition?
>>>
>

