spark-user mailing list archives

From vincent gromakowski <vincent.gromakow...@gmail.com>
Subject Re: kafka and zookeeper set up in prod for spark streaming
Date Fri, 03 Mar 2017 08:29:21 GMT
Hi,
Depending on the Kafka version (< 0.8.2, I think), offsets are managed in
ZooKeeper, and if you have lots of consumers it's recommended to use a
dedicated ZooKeeper cluster (always with dedicated disks; SSDs are even
better). On newer versions, offsets are managed in an internal Kafka topic
and ZooKeeper is only used to store metadata, so you can share it with Hadoop.
You might reach a limit depending on the size of your Kafka cluster, the
number of topics, producers/consumers... but I have never heard of that
happening. Another point is to be careful about security on ZooKeeper:
sharing a cluster means Kafka and Hadoop get the same security level
(authentication or not).
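
To make the version distinction concrete, here is a minimal sketch (not a
definitive setup) of the properties for the old ZooKeeper-based high-level
consumer. The 0.8.2 cut-off is the one guessed above, so treat it as an
assumption. The helper names, the host names and the /kafka chroot path are
hypothetical; the property names (group.id, zookeeper.connect,
offsets.storage, dual.commit.enabled) are the old consumer's own settings.

```python
def parse_version(text):
    """Turn a version string such as '0.8.2.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def old_consumer_props(kafka_version, group_id, zk_connect):
    """Build properties for the old (ZooKeeper-based) high-level consumer.

    Assumption: from 0.8.2 onwards offsets can be stored in Kafka itself,
    which takes the offset write load off ZooKeeper.
    """
    props = {
        "group.id": group_id,
        # The old consumer always needs ZooKeeper for metadata.
        "zookeeper.connect": zk_connect,
    }
    if parse_version(kafka_version) >= (0, 8, 2):
        # Commit offsets to Kafka only, not to ZooKeeper as well.
        props["offsets.storage"] = "kafka"
        props["dual.commit.enabled"] = "false"
    return props


# Hypothetical ensemble shared with Hadoop; the /kafka chroot keeps
# Kafka's znodes under their own path.
SHARED_ZK = "zk1:2181,zk2:2181,zk3:2181/kafka"

for version in ("0.8.1", "0.8.2", "0.10.1"):
    print(version, old_consumer_props(version, "spark-streaming-app", SHARED_ZK))
```

With offsets kept in Kafka, a shared ensemble only sees metadata traffic from
Kafka, which is why sharing becomes more reasonable on newer versions.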

2017-03-03 9:15 GMT+01:00 Mich Talebzadeh <mich.talebzadeh@gmail.com>:

>
> hi,
>
> In DEV, Kafka and ZooKeeper services can be co-located on the same
> physical hosts.
>
> In Prod, moving forward, do we need to set up ZooKeeper on its own cluster,
> not shared with the Hadoop cluster? Or can these services be shared within
> the Hadoop cluster?
>
> How best to set up the ZooKeeper that Kafka needs for use with Spark
> Streaming?
>
> Thanks
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
