samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yi Pan <nickpa...@gmail.com>
Subject Re: Does Samza create partitions automatically when sending messages?
Date Wed, 04 Nov 2015 04:40:02 GMT
Hi, John,

Unfortunately, Samza currently does not handle creation of output topic w/
user specified partitions. The construction of OutgoingMessageEnvelope only
guarantees that the outgoing messages are partitioned by the specific key.
Hence, the messages with the same partition key are always in the same
partition. However, Samza does not govern how *many* partitions the output
topic will have. That often requires users to pre-plan and configure the
output Kafka topic.

-Yi

On Tue, Nov 3, 2015 at 8:07 AM, John Tipper <john_tipper@hotmail.com> wrote:

> If you use Samza's OutgoingMessageEnvelope<
> https://samza.apache.org/learn/documentation/0.9/api/javadocs/org/apache/samza/system/OutgoingMessageEnvelope.html>
> to send a message using this format:
>
> public OutgoingMessageEnvelope(SystemStream systemStream,
>                                java.lang.Object partitionKey,
>                                java.lang.Object key,
>                                java.lang.Object message)
> Constructs a new OutgoingMessageEnvelope from specified components.
> Parameters:
> systemStream - Object representing the appropriate stream of which this
> envelope will be sent on.
> partitionKey - A key representing which partition of the systemStream to
> send this envelope on.
> key - A deserialized key to be used for the message.
> message - A deserialized message to be sent in this envelope.
>
>
> and you call this method within a stream task's process() method and want
> to route the incoming messages to an appropriate partition, will Samza
> create the partitions for you when you call the method?
>
> E.g.
>
> MessageA = {"id": "idA", "key": "keyA", "body":"some details"}
> MessageB = {"id": "idB", "key": "keyB", "body":"some more details"}
>
>
> If I call within a stream task's process() where msg is a message instance:
>
> public void process(IncomingMessageEnvelope envelope, MessageCollector
> collector, TaskCoordinator coordinator) {
>     // ...
>     String partition = msg["id"]
>     String key = msg["key"]
>     collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka",
> "PartitionedMessages"), id, key, msg));
>     // ...
>
>
> Will this create partitions idA and idB automatically for me (i.e. do I
> need to have created these partitions before I send message to them)?
>
> I want to be able to route a message to an appropriate partition and also
> to be able to log compaction with a separate message key.  I do not know in
> advance how many partitions I will need - is this compatible with the way
> Samza works?
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message