samza-dev mailing list archives

From Angelica Garcia-Gutierrez <>
Subject Samza + MongoDb Sharding
Date Thu, 06 Jul 2017 19:21:28 GMT

I have a Samza job that currently makes remote calls to MongoDB to get additional information
about the messages in the input stream. For scalability, the MongoDB database was initially
partitioned into 4 shards (more shards will be added as needed).
The question is:

  *   Does it make sense to partition the input stream so that each task consumes one
partition and enriches its messages with information retrieved from a specific MongoDB
shard?
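
To make the idea concrete, a minimal sketch of the key-based routing could look like the
following. Everything here is an assumption for illustration: the names NUM_SHARDS and
partition_for are hypothetical, the hash is a plain MD5-based stable hash and not the function
MongoDB uses for hashed sharding, and real shard placement is decided by MongoDB's chunk ranges
and balancer, so the alignment only holds if the producer can reproduce or look up that mapping.

```python
import hashlib

# Assumption: one input-stream partition per MongoDB shard (4 shards today).
NUM_SHARDS = 4


def partition_for(shard_key: str, num_partitions: int = NUM_SHARDS) -> int:
    """Map a shard-key value to a partition index using a stable hash.

    This is NOT MongoDB's own hashed-sharding function; it only shows
    that the same key is always routed to the same partition.
    """
    digest = hashlib.md5(shard_key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo
    # the partition count.
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    # Hypothetical message keys; each task would then query only the
    # shard that corresponds to its partition.
    for key in ["user-1", "user-2", "user-3"]:
        print(key, "-> partition", partition_for(key))
```

Note that with a modulo scheme like this, adding shards later changes the partition that most
keys map to, so the stream would have to be repartitioned whenever the shard count grows.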

Can someone please shed some light on this?

