kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: kafka server configuration questions
Date Wed, 17 Aug 2011 13:33:34 GMT
Ismail,

Most of what you described is reasonable. A few comments:
1. If you run 2 consumers in the same group, each of them will get only
about half of the data from the brokers. So, in 1), the second consumer
cannot be a pure standby: to process all data, it has to process messages too.

2. Typically, you can co-locate the ZK servers with the Kafka brokers. However,
you need at least 3 ZK servers, since a ZK ensemble stays available only while
a majority of its servers is up.

3. The total number of partitions should be at least the total number of
consumer threads; otherwise some threads will have no partition to read from.
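
The arithmetic behind points 1 to 3 can be sketched roughly as follows. This is
only an illustration: `split_partitions` is a hypothetical, simplified
range-style split and not Kafka's actual rebalancing code, and the consumer and
thread names are made up. `zk_tolerated_failures` just encodes the majority rule.

```python
def split_partitions(num_partitions, consumer_threads):
    """Divide partition ids 0..num_partitions-1 into contiguous ranges,
    one range per consumer thread (hypothetical, simplified model)."""
    base, extra = divmod(num_partitions, len(consumer_threads))
    assignment = {}
    start = 0
    for i, thread in enumerate(sorted(consumer_threads)):
        # The first `extra` threads take one additional partition.
        count = base + (1 if i < extra else 0)
        assignment[thread] = list(range(start, start + count))
        start += count
    return assignment


def zk_tolerated_failures(ensemble_size):
    """Number of ZK servers that may fail while a strict majority stays up."""
    quorum = ensemble_size // 2 + 1
    return ensemble_size - quorum


# Point 1: two consumers in one group split the partitions between them,
# so neither of them sees all of the data.
two_consumers = split_partitions(4, ["consumer-a", "consumer-b"])

# Point 3: with more threads than partitions, some threads get nothing.
idle = [t for t, parts in split_partitions(2, ["t1", "t2", "t3"]).items()
        if not parts]

# Point 2: an ensemble of 2 tolerates no failure; 3 tolerates one.
zk = {n: zk_tolerated_failures(n) for n in (1, 2, 3)}
```

In this model a 2-server ZK ensemble is no more fault tolerant than a single
server, which is why 3 is the practical minimum.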

Jun

On Wed, Aug 17, 2011 at 4:52 AM, Ismail Dev <develop16@googlemail.com> wrote:

> Hi all,
>
> we are working on a project that should collect trace, journal and audit
> entries produced by an application running on a Tomcat server into a
> central data store.
> We expect about one million entries per day, about 150 MB of data.
>
> 1.)
> The trace entries must be collected in the same order in which the
> application produced them, and we need a failover mechanism.
> The intended configuration for trace collection would be:
> - exactly one producer on the Tomcat server creating trace entries
> - the producer sends the messages always to the same partition/broker
> - 2 brokers on different physical servers
> - 2 consumers running on both broker servers
> - the consumers belong to the same group (e.g. 'trace')
> - just one consumer is processing the messages and the second one is for
> failover
>
> How should the ZooKeeper configuration look?
> - one ZooKeeper server for the brokers, running on the server where the
> Tomcat server runs
> - or 2 clustered ZooKeeper servers, each running on a broker's physical
> server
>
> Is it a good idea to run the consumers on the same physical server as the
> brokers ?
>
> Does this configuration make sense?
>
> 2.)
> For the journal and audit entries, the order is not important. So the
> intended configuration for these would be:
> - n producers running on the tomcat server
> - the producers send the messages randomly to available brokers
> - at least 2 brokers with m partitions on different physical servers
> - at least 2 consumers running on both broker servers with m threads
> - the consumers belong to different groups (e.g. 'journal' and 'audit')
>
> My question here is how to figure out the number of partitions. Are there
> any benchmark figures or rules of thumb?
>
> Many thanks,
> Ismail.
>
