kafka-users mailing list archives

From "Jiri Humpolicek" <Jiri.Humpoli...@seznam.cz>
Subject Does MirrorMaker ensure exactly-once delivery across clusters?
Date Thu, 11 Jan 2018 08:46:12 GMT
Hi Everyone, 

Since Kafka 0.11.x supports exactly-once semantics, I want to find out 
whether it is possible to achieve exactly-once delivery across Kafka 
clusters using MirrorMaker. 

We have two locations with a "primary" cluster in each, and for each 
location we have one "aggregation" cluster that mirrors data from all 
primary clusters. 

Currently we deduplicate messages after copying data from the aggregation 
Kafka to HDFS, using a separate YARN application. But the duplicates remain 
in the aggregation Kafka. So I want to ensure that there are no duplicates 
and no data loss in Kafka as well. In that case our deduplication YARN 
application would no longer be needed. 
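
To illustrate what the deduplication step does, here is a minimal sketch. 
It is not our actual YARN application. It assumes that each message carries 
a unique identifier and that the set of identifiers seen so far fits in 
memory; the names `message_id` and `deduplicate` are hypothetical. 

```python
from typing import Iterable, Iterator, Tuple

# (message_id, payload); message_id is an assumed unique identifier per message
Record = Tuple[str, bytes]


def deduplicate(records: Iterable[Record]) -> Iterator[Record]:
    """Yield each record once, keeping the first occurrence of every message_id."""
    seen = set()
    for message_id, payload in records:
        if message_id in seen:
            # A copy delivered more than once, e.g. after a mirroring retry.
            continue
        seen.add(message_id)
        yield message_id, payload
```

With exactly-once mirroring, a step like this would have nothing left to 
drop. 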

If it is possible, how should MirrorMaker be configured to achieve 
exactly-once delivery between the primary and aggregation clusters? 


Thanks and have a nice day, Jiri Humpolicek 