kafka-users mailing list archives

From "Matthias J. Sax" <matth...@confluent.io>
Subject Re: Kafka Streams - Parallel by default or 1 thread per topic?
Date Tue, 04 Oct 2016 21:19:05 GMT
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Yes.

On 10/4/16 1:47 PM, Ali Akhtar wrote:
> Hey Matthias,
> 
> All my topics have 3 partitions each, and I will have about 20-30
> topics in total that need to be subscribed to and managed.
> 
> So, if I create an app which registers handlers for each of the 30
> topics, the parallelization / multithreading will be handled behind
> the scenes by Kafka Streams, correct?
> 
> If I deployed 2 more instances of the same app, to have 3 instances
> of my app, will the load get redistributed automatically so that,
> instead of one instance listening to all 3 partitions of each
> topic, each instance of the app listens to 1 partition of each
> topic?
> 
> (Each instance of the app will be using the same consumer group
> name)
> 
> On Wed, Oct 5, 2016 at 1:43 AM, Matthias J. Sax
> <matthias@confluent.io> wrote:
> 
> Kafka Streams parallelizes via Kafka partitions -- for each
> partition a task is created. If you subscribe to multiple topics,
> the topic with the most partitions determines the number of tasks,
> and each task gets partitions from all topics assigned.
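
A minimal sketch of the task model described above, in plain Java with no Kafka dependency. It assumes the simplified rule from this message (task count equals the largest partition count, and task i reads partition i of every topic that has one). The class name, method names, and topic names are hypothetical, and this is not the library's actual assignment code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TaskModelSketch {

    // Simplified rule: the topic with the most partitions sets the task count.
    static int taskCount(Map<String, Integer> partitionsPerTopic) {
        return partitionsPerTopic.isEmpty()
                ? 0
                : Collections.max(partitionsPerTopic.values());
    }

    // Task i is given partition i of every topic that has a partition i.
    // Entries are rendered as "<topic>-<partition>", topics in sorted order.
    static List<List<String>> partitionsPerTask(Map<String, Integer> partitionsPerTopic) {
        Map<String, Integer> sorted = new TreeMap<>(partitionsPerTopic);
        List<List<String>> result = new ArrayList<>();
        for (int task = 0; task < taskCount(sorted); task++) {
            List<String> assigned = new ArrayList<>();
            for (Map.Entry<String, Integer> topic : sorted.entrySet()) {
                if (task < topic.getValue()) {
                    assigned.add(topic.getKey() + "-" + task);
                }
            }
            result.add(assigned);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical topics: two with 3 partitions, one with 1 partition.
        Map<String, Integer> topics = new TreeMap<>();
        topics.put("orders", 3);
        topics.put("payments", 3);
        topics.put("audit", 1);

        System.out.println("tasks: " + taskCount(topics));
        List<List<String>> perTask = partitionsPerTask(topics);
        for (int i = 0; i < perTask.size(); i++) {
            System.out.println("task " + i + " -> " + perTask.get(i));
        }
    }
}
```

With 30 topics of 3 partitions each, this rule yields 3 tasks, each holding one partition of every topic.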
> 
> Furthermore, you can configure the number of threads
> (num.stream.threads, see
> http://docs.confluent.io/current/streams/developer-guide.html#optional-configuration-parameters)
> -- the max useful configuration is the number of created tasks.
> Keep in mind, if you start multiple instances of your Streams app,
> partitions are managed in a consumer group fashion, i.e., are
> distributed over the running instances.
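
A rough sketch of the two knobs mentioned above, again in plain Java without the Kafka client on the classpath. The property keys `application.id`, `bootstrap.servers`, and `num.stream.threads` are real Kafka Streams settings (the application id also serves as the consumer group id). The class, the helper methods, the sample values, and the round-robin spreading are assumptions for illustration only. The real assignor is more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class StreamsScalingSketch {

    // Builds the properties one would hand to a Streams application.
    // Every instance must use the same application.id to share the work.
    static Properties streamsProps(String applicationId, String bootstrapServers, int threads) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("num.stream.threads", Integer.toString(threads));
        return props;
    }

    // Threads beyond the number of tasks have nothing to do.
    static int usefulThreads(int configuredThreads, int taskCount) {
        return Math.min(configuredThreads, taskCount);
    }

    // Illustrative round-robin spread of task ids over running instances.
    static List<List<Integer>> spreadTasks(int taskCount, int instances) {
        List<List<Integer>> perInstance = new ArrayList<>();
        for (int i = 0; i < instances; i++) {
            perInstance.add(new ArrayList<>());
        }
        for (int task = 0; task < taskCount; task++) {
            perInstance.get(task % instances).add(task);
        }
        return perInstance;
    }

    public static void main(String[] args) {
        // Hypothetical application id and broker address.
        Properties props = streamsProps("my-streams-app", "localhost:9092", 3);
        System.out.println(props.getProperty("num.stream.threads"));
        System.out.println("useful threads for 3 tasks: " + usefulThreads(8, 3));
        System.out.println("1 instance:  " + spreadTasks(3, 1));
        System.out.println("3 instances: " + spreadTasks(3, 3));
    }
}
```

Under these assumptions, going from one instance to three moves each of the 3 tasks to its own instance, which matches the redistribution asked about in the question.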
> 
> Please see here for more details:
> http://docs.confluent.io/current/streams/architecture.html#parallelism-model
> 
> 
> -Matthias
> 
> On 10/4/16 1:27 PM, Ali Akhtar wrote:
>>>> I need to consume a large number of topics, and handle each
>>>> topic in a different way.
>>>> 
>>>> I was thinking about creating a different KStream for each
>>>> topic, and doing KStream.foreach for each stream, to process
>>>> incoming messages.
>>>> 
>>>> However, it's unclear if this will be handled in a parallel
>>>> way by default, or if I need to create a managed ThreadPool
>>>> and create the KStream for each topic within its own thread
>>>> pool.
>>>> 
>>>> Can anyone shed some light - does KStreamBuilder / KStream
>>>> handle concurrency for each KStream, or does this need to be
>>>> managed?
>>>> 
>>>> Thanks.
>>>> 
>> 
> 
-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - https://gpgtools.org

iQIcBAEBCgAGBQJX9BzJAAoJECnhiMLycopPjlQP/j8wVrA8zMMAZESTNbHEOlvT
tQp/l00MW5XWWVWiR/i/BUeRdmUNGRWzKikchdZGzyxsCYxvprP6k8/JKHEr/mD7
VyoS6/dZLL7Z51cP0wVyWUIwXU9BRr5VvzUZlXthFFA6F7gY7azS3LqRpx+aZNZD
IOBpPJtpvSXiFIBbOqrfHtHy62WRd9C9koSg2wfyGjPxH0J9qFO+/Jq5VrGk0HH2
FXkYtT8PVm60RuKkMa1DrAK148iPJDfLO/GADAgfBejnV6PP/csh9JNIwi6ZcLLe
xko/WKhLPD0SPnnqPFpf6Sqguv38y/6fUFJw44MNs4z5A5c0KG7lqZ9FELs9CiUO
+mJL97WYy4yhxYiTI0E4Cbr9BsPwJ5CRNPrhu0euHvQ08O1WjIJoFvgYCNo+dKnS
oq4A8IYV31+U8hQASQGSe4ejq6og/55EigKMa+VkqoGW8vSqEofL4AXZEq1tR+Ya
+nql7aOSFef9/n2JRghOaZB12QW/oXz2nX+yzj8fhoyZMIQGuzQGQebYq2I9Zkg6
/+QsJKHUkCGzh37k66LusEL/HgMEhIs4nhlxhZ3rWybrkkv5Oi7jO1o8ox8k9Cvh
Ahp/YGPsD8O1L/9x4IaA4l/U2MY3/wVQpyFCRQQEL49FmS66wdIGlnodZwO4SrdS
DO4SLxIyi70WtIXV8qVK
=y2zf
-----END PGP SIGNATURE-----
