samza-dev mailing list archives

From Tommy Becker
Subject Re: sending data to a different partitions of the output stream
Date Tue, 07 Apr 2015 11:46:56 GMT
If you want to send to a specific partition number, you can pass that number as the partition
key. This works because the default partitioner chooses the partition from the key's hash code,
and the hash code of an Integer is the value itself.
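
A minimal Java sketch of why this works. It assumes the partitioner takes the non-negative
hash code of the partition key modulo the number of partitions; that formula is an assumption
for illustration, not code copied from Samza or Kafka. The class name PartitionKeySketch, the
method partitionFor, and the partition count of 8 are hypothetical.

```java
// Hypothetical illustration only; this is not Samza or Kafka source code.
final class PartitionKeySketch {

    // Assumed model of a hash-based partitioner:
    // clear the sign bit of the key's hash code, then take it modulo the partition count.
    static int partitionFor(Object partitionKey, int numPartitions) {
        return (partitionKey.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 8; // assumed partition count of the output stream

        // An int argument is boxed to Integer, and Integer.hashCode() returns the int value,
        // so each key below maps to the partition with the same number.
        for (int target = 0; target < numPartitions; target++) {
            System.out.println("key " + target + " -> partition "
                    + partitionFor(target, numPartitions));
        }
    }
}
```

Under this assumption an Integer key that is greater than or equal to the partition count wraps
around, so the key should stay in the range 0 to numPartitions - 1.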

On 04/07/2015 07:40 AM, Vladimir Lebedev wrote:

I cannot find a clear explanation of this in the documentation or in hello-samza: how do I tell
the collector to send my output data to a particular partition of the output stream?

My understanding is that in my process() method I have to create an OutgoingMessageEnvelope
object, passing not only my deserialized data but also my partition key, like this:

    try {
      collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", "output"),
          my_partition_key, null, my_data));
    } catch (Exception e) {
      System.err.println("Unable to parse line: " + event);
    }

The question is: who is responsible for computing the partition number from my_partition_key?
How, for example, can I establish some kind of consistent hashing mechanism for computing
the partition number from the key? Is it configurable via task properties, as it is in
Kafka via the partitioner.class property?

Many thanks in advance,


Vladimir Lebedev

Tommy Becker
Senior Software Engineer
