beam-commits mailing list archives

From echauc...@apache.org
Subject [beam] branch spark-runner_structured-streaming updated (f0522dc -> dcb3949)
Date Thu, 27 Jun 2019 15:03:17 GMT
This is an automated email from the ASF dual-hosted git repository.

echauchot pushed a change to branch spark-runner_structured-streaming
in repository https://gitbox.apache.org/repos/asf/beam.git.


    from f0522dc  [to remove] temporary: revert extractKey while combinePerKey is not done
(so that it compiles)
     new 884e5f90 Applying a groupByKey avoids, for some reason, the Spark structured streaming
framework casting data to Row, which makes it impossible to deserialize without the coder shipped
with the data. For performance reasons (to avoid memory consumption and having to deserialize),
we do not ship coder + data. Also add a mapPartitions before the GBK to avoid shuffling
     new 11c3792  Fix case when a window does not merge into any other window
     new dcb3949  Fix wrong encoder in combineGlobally GBK

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../batch/AggregatorCombinerGlobally.java          |  7 +++-
 .../batch/CombineGloballyTranslatorBatch.java      | 44 ++++++++++++++++++----
 2 files changed, 42 insertions(+), 9 deletions(-)

