kafka-users mailing list archives

From Ali Akhtar <ali.rac...@gmail.com>
Subject Re: Kafka Streams - Parallel by default or 1 thread per topic?
Date Tue, 04 Oct 2016 22:50:40 GMT
<3

On Wed, Oct 5, 2016 at 2:31 AM, Ali Akhtar <ali.rac200@gmail.com> wrote:

> That's awesome. Thanks.
>
> On Wed, Oct 5, 2016 at 2:19 AM, Matthias J. Sax <matthias@confluent.io>
> wrote:
>
>> Yes.
>>
>> On 10/4/16 1:47 PM, Ali Akhtar wrote:
>> > Hey Matthias,
>> >
>> > All my topics have 3 partitions each, and I will have about 20-30
>> > topics in total that need to be subscribed to and managed.
>> >
>> > So, if I create an app which registers handlers for each of the 30
>> > topics, the parallelization / multithreading will be handled behind
>> > the scenes by Kafka Streams, correct?
>> >
>> > If I deployed 2 more instances of the same app, to have 3 instances
>> > of my app, will the load get redistributed automatically so that
>> > instead of the same app listening to all 3 partitions for each
>> > topic, this gets spread around so that each instance of the app
>> > now listens to 1 partition of each topic?
>> >
>> > (Each instance of the app will be using the same consumer group
>> > name)
>> >
>> > On Wed, Oct 5, 2016 at 1:43 AM, Matthias J. Sax
>> > <matthias@confluent.io> wrote:
>> >
>> > Kafka Streams parallelizes via Kafka partitions -- for each
>> > partition a task is created. If you subscribe to multiple topics,
>> > the topic with the most partitions determines the number of tasks,
>> > and each task gets partitions from all topics assigned.
>> >
>> > Furthermore, you can configure the number of threads
>> > (num.stream.threads, see
>> > http://docs.confluent.io/current/streams/developer-guide.html#optional-configuration-parameters)
>> > -- the max useful configuration is the number of created tasks.
>> > Keep in mind, if you start multiple instances of your Streams app,
>> > partitions are managed in a consumer group fashion, i.e., they are
>> > distributed over the running instances.
>> >
>> > Please see here for more details:
>> > http://docs.confluent.io/current/streams/architecture.html#parallelism-model
>> >
>> >
>> > -Matthias
>> >
>> > On 10/4/16 1:27 PM, Ali Akhtar wrote:
>> >>>> I need to consume a large number of topics, and handle each
>> >>>> topic in a different way.
>> >>>>
>> >>>> I was thinking about creating a different KStream for each
>> >>>> topic, and doing KStream.foreach for each stream, to process
>> >>>> incoming messages.
>> >>>>
>> >>>> However, it's unclear if this will be handled in a parallel
>> >>>> way by default, or if I need to create a managed ThreadPool
>> >>>> and create the KStream for each topic within its own thread
>> >>>> pool.
>> >>>>
>> >>>> Can anyone shed some light - does KStreamBuilder / KStream
>> >>>> handle concurrency for each KStream, or does this need to be
>> >>>> managed?
>> >>>>
>> >>>> Thanks.
>> >>>>
>> >>
>> >
>>
>
>
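
The rule quoted above (the topic with the most partitions sets the task count, and tasks are spread over running instances) can be sketched as a small toy model. This is only an illustration of that simplified rule, not Kafka's real assignor: the class name, method names, topic names, and the round-robin spreading are all assumptions made for the example, and real Kafka Streams may create more tasks, for example when topics are processed independently of each other.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the parallelism described in this thread; NOT Kafka's actual assignment logic. */
class StreamsParallelismSketch {

    /** Task count under the simplified rule: the topic with the most partitions decides. */
    static int taskCount(Map<String, Integer> partitionsPerTopic) {
        return partitionsPerTopic.isEmpty() ? 0 : Collections.max(partitionsPerTopic.values());
    }

    /** Spread task ids 0..taskCount-1 over the instances round-robin (an assumption). */
    static List<List<Integer>> assign(int taskCount, int instances) {
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < instances; i++) {
            result.add(new ArrayList<>());
        }
        for (int t = 0; t < taskCount; t++) {
            result.get(t % instances).add(t);
        }
        return result;
    }

    public static void main(String[] args) {
        // 30 hypothetical topics with 3 partitions each, as in the question.
        Map<String, Integer> topics = new TreeMap<>();
        for (int i = 1; i <= 30; i++) {
            topics.put("topic-" + i, 3);
        }
        int tasks = taskCount(topics);
        System.out.println("tasks = " + tasks);                    // tasks = 3
        System.out.println("1 instance:  " + assign(tasks, 1));    // [[0, 1, 2]]
        System.out.println("3 instances: " + assign(tasks, 3));    // [[0], [1], [2]]
    }
}
```

Under these assumptions, one instance holds all three tasks, and three instances hold one task each, which matches the "Yes." above. Setting num.stream.threads higher than the task count would leave the extra threads idle.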
