beam-commits mailing list archives

From "Maximilian Roos (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-2767) BigQueryIO result different for REPEATED field between DirectRunner and DataflowRunner
Date Thu, 22 Mar 2018 19:29:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410168#comment-16410168
] 

Maximilian Roos commented on BEAM-2767:
---------------------------------------

Is there any way to work around this in Python? I'm happy to make significant adjustments;
a pipeline with this issue is extremely difficult to debug.

My nuclear option is to change all my queries to return unnested data and then re-nest it
in Dataflow.

> BigQueryIO result different for REPEATED field between DirectRunner and DataflowRunner
> --------------------------------------------------------------------------------------
>
>                 Key: BEAM-2767
>                 URL: https://issues.apache.org/jira/browse/BEAM-2767
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp, runner-dataflow, runner-direct
>    Affects Versions: 2.0.0
>            Reporter: Andre
>            Assignee: Chamikara Jayalath
>            Priority: Minor
>
> When running a query against BigQueryIO with a REPEATED RECORD field, the behavior differs
> between DirectRunner and DataflowRunner. The field containing the repeated record has to be
> cast to access the records, and each runner needs a different cast. The following
> implementations work for their respective runners, but I would expect them to be the same;
> as it stands, my pipeline runs on only one of the two.
> DirectRunner:
> {code:java}
> ArrayList<LinkedHashMap> orderLines = (ArrayList<LinkedHashMap>) c.element().get("RepeatedField");
> {code}
> DataflowRunner:
> {code:java}
> ImmutableList<TableRow> orderLines = (ImmutableList<TableRow>) c.element().get("RepeatedField");
> {code}
>
> For example, when using the ImmutableList implementation on DirectRunner, the following
> exception is thrown:
> {code:java}
> java.lang.ClassCastException: java.util.ArrayList cannot be cast to com.google.common.collect.ImmutableList
> {code}
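
One possible way to write a single cast that works on both runners is to cast to the shared
interfaces rather than to a concrete class. The sketch below assumes that the value returned
for a REPEATED field is always some java.util.List (both ArrayList and ImmutableList are) and
that each element is some java.util.Map (LinkedHashMap is, and TableRow is assumed to be as
well). RepeatedFields and asRecords are hypothetical names for illustration only; they are
not part of Beam.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Beam: reads a REPEATED RECORD value
// without depending on the concrete list or map class a runner returns.
final class RepeatedFields {
  private RepeatedFields() {}

  // Accepts whatever c.element().get("RepeatedField") returned.
  // Assumption: the value is a java.util.List whose elements are java.util.Map.
  static List<Map<String, Object>> asRecords(Object value) {
    if (value == null) {
      // Treat a missing field as an empty repeated field.
      return Collections.emptyList();
    }
    if (!(value instanceof List)) {
      throw new IllegalArgumentException(
          "Expected a List for a REPEATED field but got " + value.getClass().getName());
    }
    List<Map<String, Object>> records = new ArrayList<>();
    for (Object item : (List<?>) value) {
      if (!(item instanceof Map)) {
        throw new IllegalArgumentException(
            "Expected a Map for each record but got "
                + (item == null ? "null" : item.getClass().getName()));
      }
      // Unchecked because the key and value types are erased at runtime.
      @SuppressWarnings("unchecked")
      Map<String, Object> record = (Map<String, Object>) item;
      records.add(record);
    }
    return records;
  }
}
```

Inside the DoFn from the description, the call would then read
RepeatedFields.asRecords(c.element().get("RepeatedField")) in place of either runner-specific
cast. The helper copies the references into a new list, so the caller never relies on whether
the original list was mutable.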



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
