spark-user mailing list archives

From Jörn Franke <>
Subject Re: kafka and zookeeper set up in prod for spark streaming
Date Fri, 03 Mar 2017 08:29:31 GMT
I think this depends heavily on how much risk you are willing to accept. If you run these
services on dedicated nodes, other processes have less influence on them.

I have seen both setups: on Hadoop nodes and on dedicated nodes. On Hadoop, I would not
recommend putting them on data nodes or other heavily utilized nodes.

ZooKeeper does not need many resources (as long as you do not abuse it), so you might consider
putting it on a small dedicated cluster of several nodes.
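
To make "several nodes" concrete: a ZooKeeper ensemble stays available only while a strict
majority of its servers is up, so the ensemble size determines how many node failures it can
tolerate. A minimal Python sketch of that arithmetic follows; the function names are
illustrative only and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with the given number of servers."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a majority quorum still exists."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare a few candidate ensemble sizes.
    for n in (1, 3, 4, 5, 7):
        print(f"{n} nodes: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this rule 3 nodes tolerate one failure and 5 tolerate two, while 4 nodes still tolerate
only one, which is why odd ensemble sizes are commonly chosen.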

> On 3 Mar 2017, at 09:15, Mich Talebzadeh <> wrote:
> Hi,
> In DEV, Kafka and ZooKeeper services can be co-located on the same physical hosts.
> In Prod, moving forward, do we need to set up ZooKeeper on its own cluster, not shared
> with the Hadoop cluster? Or can these services be shared within the Hadoop cluster?
> How best to set up the ZooKeeper that Kafka needs for use with Spark Streaming?
> Thanks
> Dr Mich Talebzadeh
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage
> or destruction of data or any other property which may arise from relying on this email's
> technical content is explicitly disclaimed. The author will in no case be liable for any monetary
> damages arising from such loss, damage or destruction.
