From Blake Matheny <bl...@tumblr.com>
Subject Partition Question
Date Sun, 14 Aug 2011 02:39:13 GMT
Our current setup:

 2 brokers, each with num.partitions set to 5
 n producers, publishing to 5 topics
 5 consumers
   All in same consumer group
   Each is consuming from all 5 topics
   Each is reading from 2 KafkaMessageStreams
 Custom Partitioner, provides uniform distribution
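
As an illustration only (this is not the partitioner used in the setup above), a partitioner that aims for uniform distribution can be sketched as a stable hash of the message key taken modulo the partition count. The name `uniform_partition` and the choice of CRC32 are assumptions made for this sketch, and it is written in plain Python rather than against the Kafka Partitioner interface.

```python
import zlib
from collections import Counter


def uniform_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition index in [0, num_partitions).

    Hypothetical helper: a stable hash modulo the partition count.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 gives the same value in every process, unlike Python's
    # built-in hash() for strings, which is randomized per process.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


if __name__ == "__main__":
    # Count how many of 10,000 synthetic keys land on each of 10 partitions.
    counts = Counter(uniform_partition(f"key-{i}", 10) for i in range(10_000))
    for partition in sorted(counts):
        print(partition, counts[partition])
```

Running the script prints a per-partition count, which is a quick way to check how evenly a given key population spreads out.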

Having read the recently recommended Kafka paper that describes some
of the partitioning semantics, I have a few questions wrt the above
setup:
First, from the way the ZK info for the brokers reads, it looks like
setting num.partitions to 5 on each broker has actually created 10
total partitions per topic, 5 on each broker. Is that correct?

Second, with 5 topics, 5 partitions per broker, and 2 brokers, does
that give you 50 distinct message streams? I understand that a
consumer can pull from more than one partition, but if you wanted to
map a single topic/partition to each consumer, would you run 50
consumers in the above setup?

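
The arithmetic behind the 50 can be sketched by enumerating every topic/broker/partition combination. This assumes, as the first question supposes, that each broker hosts num.partitions partitions for every topic. The topic names and the broker ids 0 and 1 are placeholders, not values taken from the actual cluster.

```python
from itertools import product

NUM_BROKERS = 2
PARTITIONS_PER_BROKER = 5  # num.partitions on each broker
TOPICS = [f"topic{i}" for i in range(1, 6)]  # placeholder topic names


def topic_partitions(topics, num_brokers, partitions_per_broker):
    """List every (topic, broker_id, partition_id) triple."""
    return list(product(topics, range(num_brokers), range(partitions_per_broker)))


streams = topic_partitions(TOPICS, NUM_BROKERS, PARTITIONS_PER_BROKER)
per_topic = len(streams) // len(TOPICS)

print(f"partitions per topic: {per_topic}")
print(f"topic/partition pairs overall: {len(streams)}")
```

Under these assumptions the count is 10 partitions per topic and 50 topic/partition pairs overall.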
Lastly, I'm seeing updates to the log files on the second broker
(/tmp/kafka-logs/[topic]-[partition-id]/[logfile].kafka is growing),
but the corresponding offset znode isn't being updated by the
consumer. The same consumer is updating the offset for the same topic,
different partition/consumer, just fine (which leads me to believe the
consumer is working properly). Is there something in the above
config that sounds incorrect? I'm wondering if there is a
bug (in my code or elsewhere) when a consumer is reading from two
partitions of the same topic across more than one broker. Just
guessing, though.
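
One way to narrow this down is to list the offset znodes one would expect and compare them with what ZooKeeper actually contains. The sketch below assumes the layout /consumers/[group]/offsets/[topic]/[broker_id-partition_id] described in the Kafka design documentation of this era; verify it against the version in use. The group name, topic name, broker ids, and the `observed` list are all placeholders; in practice `observed` would come from listing the znodes with a ZooKeeper client.

```python
def offset_znode(group, topic, broker_id, partition_id):
    """Build the assumed offset znode path for one broker/partition."""
    return f"/consumers/{group}/offsets/{topic}/{broker_id}-{partition_id}"


def expected_offset_znodes(group, topics, broker_ids, partitions_per_broker):
    """All offset znodes expected if every partition is being consumed."""
    return [
        offset_znode(group, topic, broker_id, partition_id)
        for topic in topics
        for broker_id in broker_ids
        for partition_id in range(partitions_per_broker)
    ]


def missing_znodes(expected, observed):
    """Expected paths that were not observed, in expected order."""
    observed_set = set(observed)
    return [path for path in expected if path not in observed_set]


# Placeholder data: pretend only broker 0's partitions have offset znodes.
expected = expected_offset_znodes("mygroup", ["topic1"], [0, 1], 5)
observed = [path for path in expected if path.rsplit("/", 1)[1].startswith("0-")]

for path in missing_znodes(expected, observed):
    print("no offset znode:", path)
```

If the missing paths all belong to one broker, that points at how partitions on that broker are being assigned or committed rather than at the consumer as a whole.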

Thanks in advance,


Blake Matheny

