beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-3088) BigQuery source should consider streaming buffer when determining estimated sizes of tables
Date Thu, 02 Nov 2017 22:22:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236699#comment-16236699 ]

ASF GitHub Bot commented on BEAM-3088:
--------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/4025


> BigQuery source should consider streaming buffer when determining estimated sizes of tables
> -------------------------------------------------------------------------------------------
>
>                 Key: BEAM-3088
>                 URL: https://issues.apache.org/jira/browse/BEAM-3088
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-gcp
>            Reporter: Chamikara Jayalath
>            Assignee: Chamikara Jayalath
>            Priority: Major
>
> Currently, the BigQuery table source determines the estimated size using the table.numBytes property:
> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryTableSource.java#L100
> If a BigQuery table has data in the streaming buffer, the size of that data is not reflected
> by table.numBytes. To better estimate the size of the table, data in the streaming buffer has
> to be considered as well. The size of data in the streaming buffer can be determined from the
> streamingBuffer.estimatedBytes property, as described in the following:
> https://cloud.google.com/bigquery/docs/reference/rest/v2/tables
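The estimate described in the issue can be sketched as the sum of table.numBytes and streamingBuffer.estimatedBytes. The class and method names below (TableSizeEstimate, estimateBytes) are hypothetical and are not taken from the Beam source or from the pull request. The sketch also assumes that a missing value (for example, a table with no streaming buffer) should count as zero bytes, and that the result should be clamped to Long.MAX_VALUE; neither choice is stated in the issue.

```java
import java.math.BigInteger;

// Hypothetical helper illustrating the size estimate proposed in BEAM-3088.
// It is not the actual Beam implementation.
public class TableSizeEstimate {

    /**
     * Returns numBytes plus the streaming buffer's estimated bytes.
     *
     * @param numBytes value of table.numBytes, or null if absent (assumed 0)
     * @param streamingBufferEstimatedBytes value of streamingBuffer.estimatedBytes,
     *        or null if the table has no streaming buffer (assumed 0)
     */
    static long estimateBytes(Long numBytes, BigInteger streamingBufferEstimatedBytes) {
        BigInteger total = BigInteger.valueOf(numBytes == null ? 0L : numBytes);
        if (streamingBufferEstimatedBytes != null) {
            total = total.add(streamingBufferEstimatedBytes);
        }
        // Assumption: clamp instead of overflowing if the sum exceeds a long.
        return total.min(BigInteger.valueOf(Long.MAX_VALUE)).longValue();
    }

    public static void main(String[] args) {
        // 1000 bytes in managed storage plus 250 bytes still in the streaming buffer.
        System.out.println(estimateBytes(1000L, BigInteger.valueOf(250))); // prints 1250
        // No streaming buffer: the estimate falls back to table.numBytes alone.
        System.out.println(estimateBytes(1000L, null)); // prints 1000
    }
}
```

With a real table, the two inputs would come from the corresponding fields of the tables resource documented at the URL above.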



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
