beam-commits mailing list archives

From "Kenneth Knowles (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (BEAM-741) Values transform does not use the correct output coder when values is an Iterable<T>
Date Tue, 21 Mar 2017 02:22:41 GMT

    [ https://issues.apache.org/jira/browse/BEAM-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15581093#comment-15581093 ]

Kenneth Knowles edited comment on BEAM-741 at 3/21/17 2:21 AM:
---------------------------------------------------------------

Great investigation. I actually think the SDK should always prefer the transform's coder.
That said, for input of type {{KV<K,V>}}, the expected behavior is for the registry
to associate the type {{V}} with the value coder and thus, in this context, provide exactly
the same coder. So I'm going to reopen and look into both of these.

I am struck by this conflict: the transform has some more detailed information about its output,
but also if the user sets a coder on the input PCollection, they have even more information
than a transform with a type variable, like {{Values}}. Maybe they know something about the
data distribution. If both the registry and each transform try to adhere to the rule of propagating
the user's intent, I think they should end up largely equivalent.
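
As a rough illustration of the rule described above (take the value coder from the user's input coder when one is available, and fall back to the registry otherwise), here is a minimal toy sketch in plain Java. It does not use Beam's API: {{Coder}}, {{SimpleCoder}}, {{KvCoder}}, {{Registry}}, and {{inferValuesCoder}} are all hypothetical stand-ins invented for this example, and the real SDK's inference logic may well differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of coder inference. None of these types are Beam's real classes. */
final class CoderInferenceSketch {

  /** Hypothetical stand-in for a coder; only its name matters here. */
  interface Coder {
    String name();
  }

  /** Hypothetical leaf coder identified by a name. */
  static final class SimpleCoder implements Coder {
    private final String name;

    SimpleCoder(String name) {
      this.name = name;
    }

    @Override
    public String name() {
      return name;
    }
  }

  /** Hypothetical KV coder that exposes its key and value component coders. */
  static final class KvCoder implements Coder {
    final Coder keyCoder;
    final Coder valueCoder;

    KvCoder(Coder keyCoder, Coder valueCoder) {
      this.keyCoder = keyCoder;
      this.valueCoder = valueCoder;
    }

    @Override
    public String name() {
      return "KvCoder(" + keyCoder.name() + ", " + valueCoder.name() + ")";
    }
  }

  /** Hypothetical registry mapping a type name to a default coder. */
  static final class Registry {
    private final Map<String, Coder> defaults = new HashMap<>();

    void register(String typeName, Coder coder) {
      defaults.put(typeName, coder);
    }

    Optional<Coder> lookup(String typeName) {
      return Optional.ofNullable(defaults.get(typeName));
    }
  }

  /**
   * Picks an output coder for a Values-like transform (assumed rule, not Beam's):
   * reuse the value component of the input coder if the input is KV-coded,
   * otherwise ask the registry for the default coder of the value type.
   */
  static Coder inferValuesCoder(Coder inputCoder, String valueTypeName, Registry registry) {
    if (inputCoder instanceof KvCoder) {
      // The user's choice for V travels with the input coder, so propagate it.
      return ((KvCoder) inputCoder).valueCoder;
    }
    return registry
        .lookup(valueTypeName)
        .orElseThrow(() -> new IllegalStateException("No coder for " + valueTypeName));
  }

  public static void main(String[] args) {
    Registry registry = new Registry();
    registry.register("Iterable<String>", new SimpleCoder("DefaultIterableCoder"));

    // The user set a specific value coder on the input collection.
    Coder input = new KvCoder(new SimpleCoder("StringCoder"), new SimpleCoder("UserIterableCoder"));

    System.out.println(inferValuesCoder(input, "Iterable<String>", registry).name());
  }
}
```

In this model the registry default is consulted only when the input coder does not expose a value component, which is one way the transform and the registry could end up agreeing on the user's coder.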


was (Author: kenn):
Great investigation. I actually think the SDK should also always prefer the transform's coder.
But, also, for input of type {{KV<K,V>}}, the expected behavior is for the registry
to associate the type {{V}} with the value coder and thus in this context provide exactly
the same coder. So I'm going to reopen and see about both of these.

I am struck by this conflict: the transform has some more detailed information about its output,
but also if the user sets a coder on the input PCollection, they have even more information
that a transform with a type variable, like {{Values}}. Maybe they know something about the
data distribution. If both the registry and each transform try to adhere to the rule of propagating
the user's intent, I think they should end up largely equivalent.

> Values transform does not use the correct output coder when values is an Iterable<T>
> ------------------------------------------------------------------------------------
>
>                 Key: BEAM-741
>                 URL: https://issues.apache.org/jira/browse/BEAM-741
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Andrew Martin
>            Assignee: Kenneth Knowles
>             Fix For: Not applicable
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
