kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Melville <amelvi...@g.hmc.edu>
Subject Re: Can Mirroring Preserve Every Topic's Partition?
Date Fri, 05 Dec 2014 21:23:33 GMT
Thank you for your replies Guozhang and Neha, though I have some followup

I wrote my own Java Consumer and Producer based off of the Kafka Producer
API and High Level Consumer. Let's call them MyConsumer and MyProducer.
MyProducer uses a custom Partitioner class called SimplePartitioner. In the
producer.config file that I specify when I run the MirrorMaker from the
command line, there is a parameter "partitioner.class". I keep getting
"ClassDefNotFoundException exceptions, no matter if I put the absolute path
to my SimplePartitioner.class file, a relative path, or even when I add
SimplePartitioner.class to the $CLASSPATH variables created in the
kafka-run-class.sh script. Here is my output error:

Exception in thread "main" java.lang.ClassNotFoundException:
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at kafka.utils.Utils$.createObject(Utils.scala:438)
at kafka.producer.Producer.<init>(Producer.scala:60)
at kafka.tools.MirrorMaker$$anonfun$1.apply(MirrorMaker.scala:116)
at kafka.tools.MirrorMaker$$anonfun$1.apply(MirrorMaker.scala:106)
at scala.collection.immutable.Range.foreach(Range.scala:81)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:233)
at scala.collection.immutable.Range.map(Range.scala:46)
at kafka.tools.MirrorMaker$.main(MirrorMaker.scala:106)
at kafka.tools.MirrorMaker.main(MirrorMaker.scala)

What is the correct value for the "partitioner.class" parameter in my
producer.properties config file?

Guozhang, in your reply to my original message you said "When the consumer
of the MM gets a message, put the message to the producer's queue...". This
seems to imply that I can specify my own custom Consumer and Producer when
I run the Mirrormaker. How can I do this? Or, if I'm understand incorrectly
and I have to use whichever default consumer/producer the Mirrormaker uses,
how can I get that consumer to learn which partition it's reading from,
pass that info to the producer, and then specify that partition ID when the
producer rights to the target cluster?


On Wed, Nov 26, 2014 at 10:33 AM, Guozhang Wang <wangguoz@gmail.com> wrote:

> Hello Alex,
> This can be done by doing some tweaks in the MM code (with the 0.8.2 new
> producer).
> 1. Set-up your MM to have the total # of producers equal to the #. of
> partitions in source / target cluster.
> 2. When the consumer of the MM gets a message, put the message to the
> producer's queue based on its partition id; i.e. if the partition id is n,
> put to n's producer queue.
> 3. When producer sends the data, specify the partition id; so each producer
> will only send to a single partition.
> Guozhang
> On Tue, Nov 25, 2014 at 8:19 PM, Alex Melville <amelville@g.hmc.edu>
> wrote:
> > Howdy friends,
> >
> >
> > I'd like to mirror the topics on several clusters to a central cluster,
> and
> > I'm looking at using the default Mirrormaker to do so. I've already done
> > some basic testing on the Mirrormaker found here:
> >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330
> >
> > and managed to successfully copy a topic's partitions on a source cluster
> > to a topic on a target cluster. So I'm able to mirror correctly. However
> > for my particular use case I need to ensure that when I copy a topic's
> > partitions from source cluster to target cluster, a partition created on
> > the target cluster contains data in the exact same order as the data on
> the
> > corresponding partition on the source cluster.
> >
> > I'm thinking of writing a Simple Consumer so I can manually compare the
> > events in a source cluster's partition with the corresponding partition
> on
> > the target cluster, but I'm not 100% sure if I'll be able to verify my
> > guarantee if I do it this way. Can anyone here verify that partitions
> > copied over to the target cluster by the default Mirrormaker are an exact
> > copy of those on the source cluster?
> >
> >
> > Thanks in advance,
> >
> > Alex Melville
> >
> --
> -- Guozhang

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message