kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ali Akhtar <ali.rac...@gmail.com>
Subject Re: Kafka Streams - Parallel by default or 1 thread per topic?
Date Tue, 04 Oct 2016 20:47:55 GMT
Hey Matthias,

All my topics have 3 partitions each, and I will have about 20-30 topics in
total that need to be subscribed to and managed.

So, if I create an app which registers handles for each of the 30 topics,
the parallelization / multithreading will be handled behind the scenes by
kafka streaming, correct?

If I deployed 2 more instances of the same app, to have 3 instances of my
app,  will the load get redistributed automatically so that instead of the
same app listening to all 3 partitions for each topic, this gets spread
around so now each instances of the app will listen to 1 partition of each
topic each?

(Each instance of the app will be using the same consumer group name)

On Wed, Oct 5, 2016 at 1:43 AM, Matthias J. Sax <matthias@confluent.io>
wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> Kafka Stream parallelizes via Kafka partitions -- for each partitions
> a task is created. If you subscribe to multiple topics, the topics
> with the most partitions determine the number of task, and each task
> get partitions from all topics assigned.
>
> Furthermore, you can configure the number to thread
> (num.stream.threads     , see
> http://docs.confluent.io/current/streams/developer-guide.html#optional-c
> onfiguration-parameters)
> - -- the max useful configuration is the number of created tasks. Keep
> in mind, if you start multiple instanced of you Streams app,
> partitions are managed in a consumer group fashion, ie, are
> distributed over the running instances.
>
> Please see here for more details
> http://docs.confluent.io/current/streams/architecture.html#parallelism-m
> odel
>
>
> - -Matthias
>
> On 10/4/16 1:27 PM, Ali Akhtar wrote:
> > I need to consume a large number of topics, and handle each topic
> > in a different way.
> >
> > I was thinking about creating a different KStream for each topic,
> > and doing KStream.foreach for each stream, to process incoming
> > messages.
> >
> > However, its unclear if this will be handled in a parallel way by
> > default, or if I need to create a managed ThreadPool and create the
> > KStream for each topic within its own thread pool.
> >
> > Can anyone shed some light - does KStreamBuilder / KStream handle
> > concurrency for each KStream, or does this need to be managed?
> >
> > Thanks.
> >
> -----BEGIN PGP SIGNATURE-----
> Comment: GPGTools - https://gpgtools.org
>
> iQIcBAEBCgAGBQJX9BRgAAoJECnhiMLycopP9hEQAKgilK2PEaG3QgohJSi8THbb
> yE/u6vVjSgNpIcZKVULoJ8cWMwh9pHb0LH28B/RNBVujHaBc5WO8YuhpvvlKeBro
> TPCvbE+IHKJI/R9pD6OdUkeN09SOd6iJ/Bbc6N02+3rsCFTLiFzwusPx8pH9Tx59
> RTA6VmWBwkzQt1pCYHIKVYil138jgRh7hjQs/3XYex0vibL3bQBltmZwYnIalcbX
> n7fAr3rKlrwLMvH1LPr5NPiyzp6al4gdXxeqNAFNI0wwb6y7nqbMeywdOh4KEruC
> XT8O63O8ykfpL+wNSldT7lnvsxwL5myEp0ONKPRD5S1URzTVEFNj9dzohwGFV7ZE
> M/1nBu5pxf6BzSBWgi1A30iTUgQo7pP7ManKhRw71kGotD/oLdu2gAL4mHuKiao6
> 6Z/6prVsDouAk4CbuvXmNlmAFgHHZswtza0qZEG0797Xl3ByhOfAcuREzOJ2c+LE
> gI1C7E3iV3FgWuOK5B6VIdu0qjC88r8hD7+Q1ep+iDoZUXKqH9LUB06sGE8EaM5Q
> X67ihVuYo1akG3Hta2JFIHbRuoHLTUnqx3BGMEi8bbZWXfeIY+jl2IwQIqAVaXEo
> soA6fgDJmGS0vrVKwF7ceT3XmGtkG9h7tHjxnr+TruC6vdWMlNrSg0A6mb158RNd
> pvdyd9Qf++uud/UZYQb0
> =VfUR
> -----END PGP SIGNATURE-----
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message