kafka-users mailing list archives

From Damian Guy <damian....@gmail.com>
Subject Re: [Streams] Threading Frustration
Date Mon, 12 Dec 2016 16:42:10 GMT
Hi Avi,

Thanks for sharing your code. I believe you are only seeing 20 threads in
use because of how Kafka Streams divides the work:

Kafka Streams uses the concept of topic groups to group related topics into
StreamTasks. A StreamTask is an independent part of the topology that can be
executed on its own, and the number of StreamTasks defines the degree of
parallelism you can achieve in your application. As you can see here:
https://gist.github.com/aviflax/f7332af810f8281549652bc4f9a5e007#file-log_fragment-log-L366
only 20 active tasks are created. When you create your KafkaStreams app, I
can see that you are using the overloaded method that takes an array of
topic names to build the stream. This means that there is only a single
topic group for your application. In this case the maximum number of
StreamTasks, and hence the maximum degree of parallelism you can achieve, is
defined by the maximum partition count of the input topics. So, I'm guessing
the max partitions you have for an input topic is 20.

If you want to split these out so that they can run in parallel, then you
will need to create a new stream for each topic.
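
To make the arithmetic concrete, here is a rough sketch in plain Java. It is
not Kafka Streams code and does not call any Kafka API; it only models the
rule described above, that a topic group gets as many tasks as its largest
input topic has partitions. The class name, topic names, and partition counts
are all made up for illustration, and it assumes each per-topic stream ends
up in its own topic group.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TaskCountSketch {

    // Tasks for one topic group: the largest partition count among its topics.
    public static int tasksForGroup(Collection<Integer> partitionCounts) {
        int max = 0;
        for (int count : partitionCounts) {
            max = Math.max(max, count);
        }
        return max;
    }

    // Tasks across independent topic groups: the sum of the per-group maximums.
    public static int totalTasks(Collection<? extends Collection<Integer>> groups) {
        int total = 0;
        for (Collection<Integer> group : groups) {
            total += tasksForGroup(group);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical input topics and their partition counts.
        Map<String, Integer> partitions = new LinkedHashMap<>();
        partitions.put("topic-a", 20);
        partitions.put("topic-b", 12);
        partitions.put("topic-c", 8);

        // One stream built from all topics: a single topic group.
        int singleGroup = tasksForGroup(partitions.values());

        // One stream per topic: modeled here as one topic group per topic.
        List<List<Integer>> perTopicGroups = new ArrayList<>();
        for (int count : partitions.values()) {
            perTopicGroups.add(Collections.singletonList(count));
        }
        int separateGroups = totalTasks(perTopicGroups);

        System.out.println("single topic group: " + singleGroup + " tasks");
        System.out.println("one group per topic: " + separateGroups + " tasks");
    }
}
```

With these made-up numbers the single group yields 20 tasks and the per-topic
layout yields 40, which is why configuring more threads than tasks leaves the
extra threads idle.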

HTH,
Damian

On Mon, 12 Dec 2016 at 16:03 Avi Flax <avi.flax@parkassist.com> wrote:

>
> > On Dec 12, 2016, at 10:24, Damian Guy <damian.guy@gmail.com> wrote:
> >
> > The code for your Streams Application. Doesn't have to be the actual code,
> > but an example of how you are using Kafka Streams.
>
> OK, I’ve prepared a gist with (I hope) the relevant code, and also some
> log records just in case they might help:
>
> https://gist.github.com/aviflax/f7332af810f8281549652bc4f9a5e007
>
> Thanks for the help!
