kafka-users mailing list archives

From Ali Akhtar <ali.rac...@gmail.com>
Subject Re: Kafka Streams - Parallel by default or 1 thread per topic?
Date Tue, 04 Oct 2016 21:31:11 GMT
That's awesome. Thanks.

On Wed, Oct 5, 2016 at 2:19 AM, Matthias J. Sax <matthias@confluent.io>
wrote:

> Yes.
>
> On 10/4/16 1:47 PM, Ali Akhtar wrote:
> > Hey Matthias,
> >
> > All my topics have 3 partitions each, and I will have about 20-30
> > topics in total that need to be subscribed to and managed.
> >
> > So, if I create an app which registers handlers for each of the 30
> > topics, the parallelization / multithreading will be handled behind
> > the scenes by Kafka Streams, correct?
> >
> > If I deployed 2 more instances of the same app, for 3 instances in
> > total, will the load get redistributed automatically, so that
> > instead of one instance listening to all 3 partitions of each
> > topic, each instance of the app listens to 1 partition of each
> > topic?
> >
> > (Each instance of the app will be using the same consumer group
> > name)
> >
> > On Wed, Oct 5, 2016 at 1:43 AM, Matthias J. Sax
> > <matthias@confluent.io> wrote:
> >
> > Kafka Streams parallelizes via Kafka partitions -- for each
> > partition a task is created. If you subscribe to multiple topics,
> > the topic with the most partitions determines the number of tasks,
> > and each task gets partitions from all topics assigned.
> >
> > Furthermore, you can configure the number of threads
> > (num.stream.threads, see
> > http://docs.confluent.io/current/streams/developer-guide.html#optional-configuration-parameters)
> > -- the max useful configuration is the number of created tasks.
> > Keep in mind, if you start multiple instances of your Streams app,
> > partitions are managed in a consumer group fashion, i.e., they are
> > distributed over the running instances.
> >
> > Please see here for more details:
> > http://docs.confluent.io/current/streams/architecture.html#parallelism-model
> >
> >
> > -Matthias
> >
> > On 10/4/16 1:27 PM, Ali Akhtar wrote:
> >>>> I need to consume a large number of topics, and handle each
> >>>> topic in a different way.
> >>>>
> >>>> I was thinking about creating a different KStream for each
> >>>> topic, and doing KStream.foreach for each stream, to process
> >>>> incoming messages.
> >>>>
> >>>> However, it's unclear if this will be handled in a parallel
> >>>> way by default, or if I need to create a managed ThreadPool
> >>>> and create the KStream for each topic within its own thread
> >>>> pool.
> >>>>
> >>>> Can anyone shed some light - does KStreamBuilder / KStream
> >>>> handle concurrency for each KStream, or does this need to be
> >>>> managed?
> >>>>
> >>>> Thanks.
> >>>>
> >>
> >
>
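The parallelism model described in the quoted reply can be sketched in plain Java. This is a simplified illustration and not Kafka Streams code: the class TaskModelSketch, its methods, and the topic names are hypothetical, and the round-robin distribution is an assumption made for the example. The real assignment is done by Kafka Streams itself and may look different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the description in the quoted reply; not Kafka Streams code.
final class TaskModelSketch {

    // Per the reply: the topic with the most partitions determines the number of tasks.
    static int taskCount(Map<String, Integer> partitionsPerTopic) {
        int max = 0;
        for (int partitions : partitionsPerTopic.values()) {
            max = Math.max(max, partitions);
        }
        return max;
    }

    // Assumption: tasks are dealt out round-robin over the running instances.
    // The real assignor may distribute them differently.
    static Map<Integer, List<Integer>> assign(int tasks, int instances) {
        Map<Integer, List<Integer>> byInstance = new LinkedHashMap<>();
        for (int i = 0; i < instances; i++) {
            byInstance.put(i, new ArrayList<>());
        }
        for (int task = 0; task < tasks; task++) {
            byInstance.get(task % instances).add(task);
        }
        return byInstance;
    }

    public static void main(String[] args) {
        // 30 topics with 3 partitions each, as in the question (names are hypothetical).
        Map<String, Integer> topics = new LinkedHashMap<>();
        for (int i = 1; i <= 30; i++) {
            topics.put("topic-" + i, 3);
        }
        int tasks = taskCount(topics);
        System.out.println("tasks = " + tasks);
        System.out.println("1 instance:  " + assign(tasks, 1));
        System.out.println("3 instances: " + assign(tasks, 3));
    }
}
```

Under these assumptions, a single instance holds all three tasks, and with three instances each one ends up with one task, which matches the "Yes." given in answer to the question about deploying two more instances.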

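For the thread setting mentioned in the quoted reply, a minimal configuration sketch using only java.util.Properties might look as follows. The keys application.id, bootstrap.servers and num.stream.threads are Kafka Streams configuration names; the helper streamsProps, the application id and the broker address are hypothetical. Capping the thread count at the number of tasks reflects the remark that more threads than tasks are not useful. As far as I know, application.id also plays the role of the consumer group name, so every instance of the app should use the same value.

```java
import java.util.Properties;

// Configuration sketch only; it builds the properties but does not start a Streams app.
final class StreamsConfigSketch {

    // Hypothetical helper: caps the requested thread count at the number of tasks,
    // since threads beyond the task count would have nothing to do.
    static Properties streamsProps(String applicationId, int requestedThreads, int taskCount) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);       // shared by all instances
        props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        int threads = Math.max(1, Math.min(requestedThreads, taskCount));
        props.setProperty("num.stream.threads", Integer.toString(threads));
        return props;
    }

    public static void main(String[] args) {
        // 8 threads requested, but only 3 tasks exist, so 3 threads are configured.
        Properties props = streamsProps("my-streams-app", 8, 3);
        System.out.println("num.stream.threads = " + props.getProperty("num.stream.threads"));
    }
}
```

The resulting Properties object would then be passed to the Streams application at startup; that part is left out here because it depends on the Kafka Streams version in use.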