kafka-users mailing list archives

From "Matthias J. Sax" <matth...@confluent.io>
Subject Re: Does MirrorMaker ensures exactly-once delivery across clusters?
Date Fri, 12 Jan 2018 03:00:01 GMT
From a transaction point of view yes.

However, the MirrorMaker consumer must know to read its offsets from the
target cluster instead of the source cluster, and this is quite
unnatural for a consumer... So it's a little bit trickier than just
piggybacking commits on the producer...
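
To make the idea concrete, here is a toy in-memory sketch in Python. It is
not the Kafka API and not how MirrorMaker is implemented: TargetCluster,
commit_transaction and mirror are hypothetical names invented for
illustration, and the "transaction" is just two updates applied together.
It only shows why storing the source offsets next to the mirrored records,
and reading them back from the target on restart, avoids duplicates.

```python
from dataclasses import dataclass, field


@dataclass
class TargetCluster:
    """Toy stand-in for the target cluster (hypothetical, not a Kafka class)."""
    records: list = field(default_factory=list)
    # Plays the role of the "special topic": source partition -> next offset to read.
    source_offsets: dict = field(default_factory=dict)

    def commit_transaction(self, batch, next_offsets):
        # Assumption: records and source offsets become visible together or not at all.
        self.records.extend(batch)
        self.source_offsets.update(next_offsets)


def mirror(source, target, partition=0, batch_size=2, crash_after_batches=None):
    """Copy source[partition] to the target, resuming from the offset stored in the target."""
    log = source[partition]
    # The unusual part: the starting position comes from the TARGET cluster.
    position = target.source_offsets.get(partition, 0)
    batches = 0
    while position < len(log):
        if crash_after_batches is not None and batches == crash_after_batches:
            raise RuntimeError("simulated crash before the next transaction commits")
        batch = log[position:position + batch_size]
        position += len(batch)
        target.commit_transaction(batch, {partition: position})
        batches += 1


source = {0: ["a", "b", "c", "d", "e"]}
target = TargetCluster()
try:
    mirror(source, target, crash_after_batches=1)  # first run dies after one batch
except RuntimeError:
    pass
mirror(source, target)  # restart resumes from the offset kept in the target
print(target.records)
```

If the offsets lived in the source cluster instead, a crash between writing
the records and committing the offsets would replay the last batch on
restart, which is where the duplicates come from.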


-Matthias

On 1/11/18 6:45 PM, Stephane Maarek wrote:
> One could refactor MirrorMaker to commit the source cluster's offsets in the target cluster
> instead (in a special topic).
> This would technically allow achieving exactly-once using the Transactional API.
> 
> But there's work associated with that  
> Let me know if I’m missing something
> 
> On 12/1/18, 6:15 am, "Matthias J. Sax" <matthias@confluent.io> wrote:
> 
>     No.
>     
>     Transactions are designed to work within a single cluster, not cross
>     cluster, ie, if you have a read-process-write pattern similar to what
>     Kafka Streams does.
>     
>     -Matthias
>     
>     On 1/11/18 12:46 AM, Jiri Humpolicek wrote:
>     > Hi Everyone, 
>     > 
>     > since kafka 0.11.x supports exactly-once semantics, I want to be sure, that
>     > it is possible to achieve exactly-once delivery across kafka clusters using
>     > MirrorMaker. 
>     > 
>     > We have got two locations with "primary" cluster in each location and for 
>     > each location we have got one "aggregation" cluster which mirrors data from
>     > all primary clusters. 
>     > 
>     > Currently we deduplicate messages after copying data from aggregation kafka
>     > to HDFS by a separate YARN application. But in aggregation kafka duplicates 
>     > remain. So I want to ensure that there are no duplicates and no data loss in 
>     > kafka as well. In this case our deduplication YARN application would not be
>     > needed anymore. 
>     > 
>     > If it is possible, how to configure MirrorMaker to achieve exactly-once 
>     > delivery across primary and aggregation clusters? 
>     > 
>     > 
>     > Thanks and have a nice day, Jiri Humpolicek 
>     > 
>     
>     
> 
> 

