From abhijeet kadam <abhijeet.e...@gmail.com>
Subject Kafka Producer load distribution
Date Wed, 05 Mar 2014 20:06:10 GMT
Hi, I am new to Kafka and am using Kafka 0.8 to build a distributed queuing
system on an Amazon Web Services cluster.

I have 4 machines: Z1, B1, B2 and B3. One ZooKeeper instance is running on
Z1, and 3 brokers are running on B1, B2 and B3, one per machine.

I am running 3 producers on the 3 broker machines (B1, B2, B3), one on each
machine. Similarly, I am running 3 consumers on the same 3 broker machines,
one on each machine.

I created a topic, let's say 'test', with 12 partitions (test-0, test-1 ...),
with 4 partitions on each broker machine:
   B1 - test-0,test-1,test-2,test-3
   B2 - test-4,test-5,test-6,test-7
   B3 - test-8,test-9,test-10,test-11

ZooKeeper assigned the broker on each machine as the leader for the
partitions present on that same machine:
Partition  -  Leader
test-0     -  B1
test-1     -  B1
test-2     -  B1
test-3     -  B1
test-4     -  B2
test-5     -  B2
test-6     -  B2
test-7     -  B2
test-8     -  B3
test-9     -  B3
test-10    -  B3
test-11    -  B3
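
For reference, here is the same mapping written as data, in a minimal Python
sketch. Nothing here calls Kafka: the dictionary is typed in by hand from the
table above, and the helper name partitions_by_leader is made up purely for
illustration.

```python
from collections import defaultdict

# Partition -> leader mapping, copied by hand from the table above.
# This is NOT fetched from Kafka; it only restates the layout as data.
LEADERS = {
    0: "B1", 1: "B1", 2: "B1", 3: "B1",
    4: "B2", 5: "B2", 6: "B2", 7: "B2",
    8: "B3", 9: "B3", 10: "B3", 11: "B3",
}


def partitions_by_leader(leaders):
    """Group partition ids by the broker that leads them (hypothetical helper)."""
    grouped = defaultdict(list)
    for partition in sorted(leaders):
        grouped[leaders[partition]].append(partition)
    return dict(grouped)
```

Grouping by leader like this is what a producer would need in order to know
which partitions are "local" to its own machine.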

All 3 producers are producing messages to this topic 'test' and all 3
consumers are trying to consume from the same topic 'test'.

What I am trying to achieve here is this: whenever a producer sends a
message to this topic, it should go through the broker on the same machine
as the producer, and therefore end up in the partitions on that machine:
Producer 1 ---> B1 ---->  (test-0,test-1,test-2,test-3) -----> consumer 1
Producer 2 ---> B2 ---->  (test-4,test-5,test-6,test-7) -----> consumer 2
Producer 3 ---> B3 ---->  (test-8,test-9,test-10,test-11) -----> consumer 3
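
To make the intended routing concrete, here is a minimal sketch of the
partition choice I have in mind, in plain Python with no Kafka calls. The
class name LocalPartitionPicker and its argument are hypothetical, and the
sketch assumes the producer already knows which partitions are led by the
broker on its own machine.

```python
import itertools


class LocalPartitionPicker:
    """Round-robin over the partitions led by the local broker (hypothetical)."""

    def __init__(self, local_partitions):
        if not local_partitions:
            raise ValueError("no partitions are led by the local broker")
        # cycle() repeats the list forever, giving a simple round-robin.
        self._cycle = itertools.cycle(list(local_partitions))

    def next_partition(self):
        """Return the partition id the next message should go to."""
        return next(self._cycle)


# Producer 2 runs on the same machine as broker B2, which leads test-4..test-7.
picker = LocalPartitionPicker([4, 5, 6, 7])
first_five = [picker.next_partition() for _ in range(5)]
```

In a real producer this choice would have to be plugged into whatever
partitioning hook the client provides; I have not worked out that part.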

I am assuming this will reduce the inter-machine message transfer and will
improve the performance.

My questions are :

1) Does it really help performance when a message is produced and consumed
on the same machine in a distributed environment?

2) I read that a producer can fetch metadata from a broker about the
partition-to-leader mapping for a topic. That would help it pick the leader
on the same machine as the producer. How can a producer fetch this
metadata? I could not find any implementation.

Thanks in advance,

