kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: Partitions and Streams
Date Wed, 22 Feb 2012 16:38:55 GMT
Milind,

The degree of parallelism for consumption is determined by the total number
of partitions in a topic. So, if you have a total of 2 partitions, you can
have at most 2 consumer threads consuming in parallel. If you create more
than 2 consumer threads, some of them will never get any data. In your case,
you can have 2 consumers, but each should use a count of 1 in topicCountMap.
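
To make that concrete, here is a rough sketch of how partitions could be
spread over consumer threads. It is only an illustration, not Kafka's actual
rebalancing code: the class PartitionAssignmentSketch and the method assign
are hypothetical names, and the contiguous-range assignment is an assumption
made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; this class is not part of the Kafka API.
class PartitionAssignmentSketch {

    // Hands out partition ids 0..numPartitions-1 to consumer threads in
    // contiguous ranges (assumed scheme). numThreads must be positive.
    static List<List<Integer>> assign(int numPartitions, int numThreads) {
        List<List<Integer>> owned = new ArrayList<>();
        int perThread = numPartitions / numThreads;
        int extra = numPartitions % numThreads;
        int next = 0;
        for (int t = 0; t < numThreads; t++) {
            // The first `extra` threads take one additional partition.
            int count = perThread + (t < extra ? 1 : 0);
            List<Integer> mine = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                mine.add(next++);
            }
            owned.add(mine);
        }
        return owned;
    }

    public static void main(String[] args) {
        // Two consumers with a count of 2 each: 4 threads, 2 partitions.
        System.out.println(assign(2, 4)); // [[0], [1], [], []]
        // Two consumers with a count of 1 each: 2 threads, 2 partitions.
        System.out.println(assign(2, 2)); // [[0], [1]]
    }
}
```

With 4 threads and 2 partitions, two threads end up owning nothing. If both
owning threads happen to belong to the same consumer, the other consumer sits
idle until a rebalance, which would be consistent with what you observed.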

Thanks,

Jun

On Wed, Feb 22, 2012 at 12:06 AM, Milind Parikh <milindparikh@gmail.com> wrote:

> I have a situation where one consumer cannot consume the data fast enough
> from the producer.
>
> So in the broker, I create two partitions for the topic. I then create two
> consumers in two separate JVMs. Both consumers have topicCountMap = 2 and
> partition 0 for consumer1 and partition 1 for consumer2. Both are running
> before the producer starts.
>
> Now when I run the default producer (in the Java example), I can see that
> one of the consumers doesn't get all of the messages (which it shouldn't,
> because it should get roughly half). However, the second consumer is
> seemingly stuck until I do CTRL^C on the first consumer. Then the pent-up
> flush happens on the second consumer, and it gets all the data that the
> first one did not get.
>
> public void run() {
>    Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
>    topicCountMap.put(topic, new Integer(2));
>    Map<String, List<KafkaMessageStream<Message>>> consumerMap =
>        consumer.createMessageStreams(topicCountMap);
>    KafkaMessageStream<Message> stream = consumerMap.get(topic).get(partition);
>    ConsumerIterator<Message> it = stream.iterator();
>    while(it.hasNext())
>      System.out.println(ExampleUtils.getMessage(it.next()));
>  }
>
>
> Shouldn't the consuming happen in parallel without needing to do a CTRL^C
> (or an equivalent of a timeout)? In other words, how can I get it to
> process in parallel? Does this have something to do with
> https://issues.apache.org/jira/browse/KAFKA-243?
>
>
> Regards
> Milind
>
