beam-commits mailing list archives

From "Reuven Lax (JIRA)" <>
Subject [jira] [Commented] (BEAM-2451) writeProtos() should allow a user to specify message attributes
Date Sat, 17 Jun 2017 22:35:00 GMT


Reuven Lax commented on BEAM-2451:

The intention of withTimestampAttribute/withIdAttribute was for the runner to fill in those
attributes automatically. The Dataflow runner does this with its own built-in PubSub
implementation (it does not use PubsubUnboundedSink). For other runners, as you mentioned,
the ShardFn will calculate the timestamp and attributes. These methods were not meant as a way
to write custom values; all they do is tell the runner (or the sink in the non-Dataflow case)
which attributes to fill in.

If you want to put your own user-calculated timestamp or id on a message, you shouldn't use
withTimestampAttribute or withIdAttribute. Just create your own PubsubMessage and write whatever
attributes you want. Turning a proto message into a byte array is as simple as calling toByteArray()
on the message (and parseFrom() when reading it back).
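
As a rough sketch of that approach (not taken from the Beam codebase): the nested Message class
below is a hypothetical stand-in for Beam's PubsubMessage, so the example runs without Beam or
protobuf on the classpath. The attribute names "event_ts" and "event_id" are made up, and the
payload bytes stand in for what protoMessage.toByteArray() would return. The timestamp is written
as milliseconds since the epoch, which I believe is one of the formats Beam's Pubsub readers
accept for a timestamp attribute.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class CustomAttributeSketch {
    // Hypothetical attribute names; use whatever names your subscribers expect.
    static final String TIMESTAMP_ATTRIBUTE = "event_ts";
    static final String ID_ATTRIBUTE = "event_id";

    // Hypothetical stand-in for Beam's PubsubMessage: payload bytes plus string attributes.
    static final class Message {
        final byte[] payload;
        final Map<String, String> attributes;

        Message(byte[] payload, Map<String, String> attributes) {
            this.payload = payload.clone();
            this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
        }
    }

    // Pairs already-serialized payload bytes with user-calculated attributes.
    // In a real pipeline the bytes would come from protoMessage.toByteArray().
    static Message withCustomAttributes(byte[] serializedProto, Instant eventTime, String id) {
        Map<String, String> attributes = new HashMap<>();
        // Milliseconds since the Unix epoch, written as a decimal string.
        attributes.put(TIMESTAMP_ATTRIBUTE, Long.toString(eventTime.toEpochMilli()));
        attributes.put(ID_ATTRIBUTE, id);
        return new Message(serializedProto, attributes);
    }

    public static void main(String[] args) {
        byte[] payload = "stand-in for proto bytes".getBytes(StandardCharsets.UTF_8);
        Message m = withCustomAttributes(payload, Instant.ofEpochMilli(1497738900000L), "order-42");
        System.out.println(m.attributes);
    }
}
```

In an actual pipeline you would apply the same mapping in a MapElements or ParDo step, construct
a real PubsubMessage from the bytes and the attribute map, and write the result with the
message-based PubsubIO write rather than writeProtos(). Check the PubsubIO Javadoc for your Beam
version for the exact method names.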

> writeProtos() should allow a user to specify message attributes
> ---------------------------------------------------------------
>                 Key: BEAM-2451
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>    Affects Versions: 2.0.0
>            Reporter: Keith Berkoben
>            Assignee: Reuven Lax
> When using PubsubIO.writeProtos(protoMessage), the PubsubMessage is created in the background
and there is no way to specify the attributes of the message. This makes the method useless
if a user wants to then use the withTimestampAttribute() or withIdAttribute() of the writer
with anything other than the default timestamp or ID (a common use case).
> As a workaround the user can manually create the PubsubMessage, but this basically requires
duplication of the writeProtos() logic, which is brittle in the case that the encoding logic

This message was sent by Atlassian JIRA
