beam-commits mailing list archives

From "Daniel Halperin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (BEAM-383) BigQueryIO: update sink to shard into multiple write jobs
Date Tue, 28 Jun 2016 19:12:57 GMT
Daniel Halperin created BEAM-383:
------------------------------------

             Summary: BigQueryIO: update sink to shard into multiple write jobs
                 Key: BEAM-383
                 URL: https://issues.apache.org/jira/browse/BEAM-383
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-gcp
            Reporter: Daniel Halperin


BigQuery has global limits on both the number of files that can be written in a single job and
the total bytes in those files. We should modify BigQueryIO.Write to chunk its output into multiple
smaller jobs that each meet these limits, write to temporary tables, and atomically copy the results
into the destination table.

This change will let us stay safely within BigQuery's load job limits.
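
To make the chunking step concrete, here is a rough sketch of how files might be grouped so that
each group respects both a file-count limit and a total-bytes limit. It is only an illustration,
not the BigQueryIO implementation: the function name partition_files and the max_files and
max_bytes parameters are hypothetical, the limit values in the usage note are made up, and the
temp-table writes and the final copy job are not shown.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation of a file to load: (path, size in bytes).
FileInfo = Tuple[str, int]


def partition_files(
    files: Iterable[FileInfo], max_files: int, max_bytes: int
) -> List[List[FileInfo]]:
    """Greedily group files so each group stays within both limits.

    max_files and max_bytes are assumed, caller-supplied limits; they are
    not the real BigQuery quota values.
    """
    if max_files < 1 or max_bytes < 1:
        raise ValueError("max_files and max_bytes must be positive")

    partitions: List[List[FileInfo]] = []
    current: List[FileInfo] = []
    current_bytes = 0

    for path, size in files:
        if size > max_bytes:
            # A single file over the byte limit can never fit in any job.
            raise ValueError(f"{path} ({size} bytes) exceeds max_bytes")
        # Start a new group if adding this file would break either limit.
        too_many = len(current) + 1 > max_files
        too_big = current_bytes + size > max_bytes
        if current and (too_many or too_big):
            partitions.append(current)
            current = []
            current_bytes = 0
        current.append((path, size))
        current_bytes += size

    if current:
        partitions.append(current)
    return partitions
```

For example, with max_files=3 and max_bytes=100, five files of 40, 40, 40, 10, and 10 bytes would
be split into two groups, each of which could back one load job into its own temporary table.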



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
