samza-dev mailing list archives

From Job-Selina Wu <>
Subject How should Samza be run on AWS?
Date Wed, 05 Aug 2015 00:58:45 GMT
Dear All: I was looking for a tutorial on how to build and run Samza on
AWS, and I found the link below. Is there a detailed tutorial on how to
build Samza on AWS?

How should Samza be run on AWS?

From Gian Merlino:

   - We've been using Samza in production on AWS for a little over a month.
   We're just using the YARN runner on a mostly stock Hadoop 2.4.0 cluster
   (not EMR). Our experience is that c3s work well for the YARN instances
   and i2s work well for the Kafka instances. Things have been pretty solid
   with that setup. For scaling YARN up and down, we just add or terminate
   instances, and this works pretty well. It can take a few minutes for the
   cluster to realize a node has gone and respawn containers elsewhere. We
   have a separate Kafka cluster just for Samza's use, different from our
   main Kafka cluster. The main reason is that we wanted to isolate the
   disk and network load of state compactions and restores (we don't use
   compacted topics in our main Kafka cluster, but we do use them with
   Samza, and the extra load on Kafka can be
