kafka-users mailing list archives

From Gwen Shapira <gshap...@cloudera.com>
Subject Re: Doubts Kafka
Date Sun, 08 Feb 2015 14:25:54 GMT
Hi Eduardo,

1. "Why sometimes the applications prefer to connect to zookeeper instead of brokers?"

I assume you are talking about the clients and some of our tools?
These are parts of an older design and we are actively working on fixing
this. The consumer used Zookeeper to store offsets; in 0.8.2 there's an
option to use Kafka itself for that (by setting *offsets.storage = kafka*).
We are planning on fixing the tools in 0.9, but obviously they are less
performance sensitive than the consumers.

2. Regarding your tests and disk usage - I'm not sure exactly what fills
your disk - if it's the kafka transaction logs (i.e. log.dir), then we
expect to store the size of all messages sent times the replication factor
configured for each topic. We keep messages for the amount of time
specified in the *log.retention* parameters. If the disk fills up within
minutes, either set log.retention.minutes very low (at the risk of losing
data if consumers need a restart), or make sure your disk capacity matches
the rate at which producers send data.
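
To make the sizing rule above concrete, here is a rough back-of-the-envelope
sketch in Python. The function names and the example numbers (100,000
messages per second of 1,000 bytes each) are made up for illustration and are
not measurements from your test. It also assumes uncompressed messages and
ignores index files and segment rounding, so treat the results as estimates
only.

```python
def required_disk_bytes(msgs_per_sec, avg_msg_bytes, retention_minutes,
                        replication_factor):
    """Approximate bytes kept across the cluster for one topic.

    Size of all messages sent during the retention window, times the
    replication factor. Ignores index files, compression and segment
    rounding.
    """
    retained_seconds = retention_minutes * 60
    return msgs_per_sec * avg_msg_bytes * retained_seconds * replication_factor


def minutes_until_full(disk_bytes, msgs_per_sec, avg_msg_bytes,
                       replication_factor):
    """Approximate minutes before disk_bytes of log space is used up,
    assuming nothing is deleted by retention in the meantime."""
    write_rate = msgs_per_sec * avg_msg_bytes * replication_factor  # bytes/s
    return disk_bytes / write_rate / 60


if __name__ == "__main__":
    disk = 250 * 10**9  # the 250GB disk from the question, single broker
    print(minutes_until_full(disk, 100_000, 1_000, 1))
    print(required_disk_bytes(100_000, 1_000, 10, 1))
```

If the first number is smaller than your retention time, the disk fills up
before any old log segments become eligible for deletion.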


On Sat, Feb 7, 2015 at 3:01 AM, Eduardo Costa Alfaia <e.costaalfaia@unibs.it> wrote:

> Hi Guys,
> I have some doubts about Kafka. The first is: why do applications
> sometimes prefer to connect to zookeeper instead of brokers? Connecting to
> zookeeper could create overhead, because we are inserting another element
> between producer and consumer. Another question is about the information
> sent by the producer: in my tests the producer sends messages to the
> brokers and within a few minutes my hard disk is full (it has 250GB). Is
> there something to do in the configuration to minimize this?
> Thanks
