kafka-users mailing list archives

From Neha Narkhede <neha.narkh...@gmail.com>
Subject Re: Keeping zookeeper happy?
Date Wed, 08 Aug 2012 17:36:41 GMT
I've added the above recommendations along with others to our
Operations wiki -


On Tue, Aug 7, 2012 at 4:14 PM, Joel Koshy <jjkoshy.w@gmail.com> wrote:
> Here are some comments from Dave. He'll be adding some more details to a
> blog post that we can link from that wiki.
> For the most part, yes, it is pretty much the obvious, but here's the
> short version of some longer details that I really should get into the
> wiki that you pointed at:
> - Redundancy in the physical/hardware/network layout: try not to put
>   them all in the same rack, decent (but don't go nuts) hardware, try to
>   keep redundant power and network paths, etc
> - I/O segregation: if you have a lot of write traffic you'll almost
>   certainly want the transaction logs on a different disk group than
>   app logs and snapshots (each write to the zookeeper service involves
>   a synchronous write to disk, which can be slow).
> - Application segregation: Unless you really understand the application
>   patterns of other apps that you want to install on the same box, it
>   can be a good idea to run zookeeper in isolation (though this can be a
>   balancing act with the capabilities of the hardware).
> - Use care with virtualization: It can work, depending on your cluster
>   layout and read/write patterns and SLAs, but the tiny overheads
>   introduced by the virtualization layer can add up and throw off
>   zookeeper, as it can be very time sensitive
> - Zookeeper configuration and monitoring: It's Java, so make sure you
>   give it 'enough' heap space (I usually run them with 3-5G, but that's
>   mostly due to the data set size we have here).  Unfortunately I don't
>   have a good formula for it. As for monitoring, both JMX and the
>   four-letter commands are very useful. They overlap in some cases (and
>   in those cases I prefer the four-letter commands; they seem more
>   predictable, or at the very least they work better with the LI
>   monitoring infrastructure).
> - Don't overbuild the cluster: large clusters, especially with a
>   write-heavy usage pattern, mean a lot of intra-cluster communication
>   (quorums on the writes and subsequent cluster member updates). But
>   don't underbuild it either (and risk swamping the cluster).
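
The I/O segregation point in the list above can be checked mechanically.
What follows is only a minimal sketch, assuming a standard zoo.cfg where
dataDir holds snapshots and dataLogDir (if set) holds the transaction
logs. The helper names and the example paths are made up for
illustration. It compares paths only, so it cannot tell whether the two
directories actually sit on different disk groups.

```python
import os


def parse_zoo_cfg(text):
    """Parse key=value lines from zoo.cfg-style text, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def txn_log_is_segregated(settings):
    """Return True if dataLogDir is set and is a different path from dataDir.

    This is a path comparison only; it does not verify that the two
    directories live on separate physical devices.
    """
    data_dir = settings.get("dataDir")
    log_dir = settings.get("dataLogDir")
    if not data_dir or not log_dir:
        # Without dataLogDir, transaction logs are written under dataDir.
        return False
    return os.path.normpath(data_dir) != os.path.normpath(log_dir)


# Hypothetical example config; the paths are invented for this sketch.
EXAMPLE_CFG = """
tickTime=2000
dataDir=/data/zookeeper/snapshots
dataLogDir=/txnlog/zookeeper
clientPort=2181
"""

if __name__ == "__main__":
    print(txn_log_is_segregated(parse_zoo_cfg(EXAMPLE_CFG)))
```

To confirm real device separation you would still need to look at the
mount layout on the host itself.
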
> Overall, I try to keep the zookeeper system as small as will handle the
> load (plus standard growth capacity planning) and as simple as possible.
> I try not to do anything fancy with the configuration or application
> layout compared to the official release, and to keep it as
> self-contained as possible.  For these reasons, I tend to skip the
> OS-packaged versions, since they tend to put things in the OS standard
> hierarchy, which can be 'messy', for want of a better way to word it.
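
For the monitoring point in Dave's list, here is a rough sketch of
polling the four-letter commands from a script. It assumes the server
answers on its client port and closes the connection after replying,
and that "mntr" is available (ZooKeeper 3.4 and later; newer releases
may also require the command to be enabled in the server config). The
function names, host, and port below are placeholders, not anything
from the thread.

```python
import socket


def four_letter(host, port, command, timeout=5.0):
    """Send a four-letter command (e.g. 'ruok', 'stat', 'mntr') and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Turn tab-separated 'mntr' output into a dict of string values."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


if __name__ == "__main__":
    # Placeholder address; point this at one of your own servers.
    host, port = "localhost", 2181
    print(four_letter(host, port, "ruok"))  # a healthy server replies "imok"
    stats = parse_mntr(four_letter(host, port, "mntr"))
    for key in ("zk_server_state", "zk_avg_latency", "zk_outstanding_requests"):
        print(key, stats.get(key))
```

Values come back as strings, so convert the numeric ones before
comparing them against alert thresholds.
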
> On Tue, Aug 7, 2012 at 12:00 PM, James A. Robinson <
> jim.robinson@stanford.edu> wrote:
>> Hi folks,
>> The operations wiki page
>>   https://cwiki.apache.org/confluence/display/KAFKA/Operations
>> states, in part
>>   Zookeeper
>>   Zookeeper is essential for the correct operation of Kafka. There are
>>   a number of things that must be done to keep zookeeper running
>>   happily as we have learned the hard way, hopefully Dave and Neha
>>   will add this since I don't know what we did.
>> I was wondering if anyone on the list had comments on this topic?
>> Were there things beyond what might be considered obvious, e.g.,
>> running at least five nodes on separate machines w/ redundant network
>> paths, and so forth?
>> Jim
>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> James A. Robinson                       jim.robinson@stanford.edu
>> Stanford University HighWire Press      http://highwire.stanford.edu/
>> +1 650 7237294 (Work)                   +1 650 7259335 (Fax)
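
On the "at least five nodes" question and the advice not to overbuild,
the arithmetic can be sketched as follows. This assumes the default
majority quorum, where a write must be acknowledged by more than half of
the voting servers; the function names are made up for illustration.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    print("servers  quorum  tolerated failures")
    for n in (3, 4, 5, 6, 7):
        print(f"{n:7d}  {quorum_size(n):6d}  {tolerated_failures(n):18d}")
```

Under that assumption, going from five servers to six raises the quorum
from three to four without tolerating any more failures, which is one
reason odd-sized ensembles are the usual choice.
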
