beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (BEAM-5427) Fix sample code (AverageFn) in Combine.java
Date Fri, 21 Sep 2018 18:12:00 GMT

     [ https://issues.apache.org/jira/browse/BEAM-5427?focusedWorklogId=146519&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-146519 ]

ASF GitHub Bot logged work on BEAM-5427:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Sep/18 18:11
            Start Date: 21/Sep/18 18:11
    Worklog Time Spent: 10m 
      Work Description: youngoli commented on a change in pull request #6439: [BEAM-5427] Fix and update sample code for CombineFn.
URL: https://github.com/apache/beam/pull/6439#discussion_r219583784
 
 

 ##########
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Combine.java
 ##########
 @@ -289,6 +305,21 @@ private Combine() {
    * arbitrary tree structure. Commutativity is required because any order of the input values is
    * ignored when breaking up input values into groups.
    *
+   * <h3>Note on Data Encoding</h3>
+   *
+   * <p>Data encoding is required to make CombineFn work. The sample code above implements {@code
+   * java.io.Serializable} so that CombineFn in consequence makes use of SerializableCoder to encode
+   * the data. In many cases though, relying on Serializable could be less preferred for efficiency
+   * considerations. In addition, Serializable may not be applicable in all situations. There are
+   * two alternatives to make data encoding work for AverageFn:
+   *
+   * <ul>
+   *   <li>Generic Coder class. Implement an encoder class explicitly and use {@code @DefaultCoder}
+   *       tag.
+   *   <li>{@link CombineFn} Specific way. While extending {@link CombineFn}, overwrite both {@link
+   *       #getAccumulatorCoder} and {@link #getDefaultOutputCoder}.
+   * </ul>
+   *
 
 Review comment:
   I think I get what you're trying to explain in this note, but the grammar is a little awkward. I'd like to suggest this to make it sound more natural:
   
   ```
      * <h3>Note on Data Encoding</h3>
      *
      * <p>Some form of data encoding is required when using custom types in a CombineFn that do
      * not have well-known coders. The sample code above uses a custom Accumulator, which it
      * encodes by making it {@code java.io.Serializable}. However, in cases where {@code
      * java.io.Serializable} is not efficient or applicable, there are two alternatives for
      * encoding:
      *
      * <ul>
      *   <li>Generic coder class: Implement a coder class explicitly and use the {@code
      *       @DefaultCoder} annotation.
      *   <li>CombineFn-specific way: While extending CombineFn, override both {@link
      *       #getAccumulatorCoder} and {@link #getDefaultOutputCoder}.
      * </ul>
   ```
   If any part of that sounds incorrect, feel free to fix it.
   
   Also, although I can review the writing in this documentation change, I can't really review its correctness, since I'm not very familiar with coders. I would recommend adding a reviewer who is familiar with coders and can vet that this is correct. Maybe lukecwik, since he's also a committer and can take care of merging.
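   
   To make the efficiency point in the note concrete, here is a minimal sketch that compares default Java serialization with a hand-written encoding for a sum-and-count accumulator. It uses only the JDK and deliberately does not touch any Beam API. The `AccumEncodingSketch` class, its `Accum` type, and the `encode`/`decode` helpers are hypothetical names chosen for illustration; an explicit coder would presumably hold similar logic, but that wiring is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class AccumEncodingSketch {

  /** Hypothetical accumulator for an averaging combiner: a running sum and a count. */
  static class Accum implements Serializable {
    private static final long serialVersionUID = 1L;
    final int sum;
    final int count;

    Accum(int sum, int count) {
      this.sum = sum;
      this.count = count;
    }
  }

  /** Default Java serialization: writes stream and class metadata along with the fields. */
  static byte[] encodeWithJavaSerialization(Accum accum) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(accum);
    }
    return bytes.toByteArray();
  }

  /** Hand-written encoding: two fixed-width ints, so always 8 bytes. */
  static byte[] encode(Accum accum) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(accum.sum);
      out.writeInt(accum.count);
    }
    return bytes.toByteArray();
  }

  /** Inverse of encode: reads the fields back in the order they were written. */
  static Accum decode(byte[] encoded) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
      int sum = in.readInt();
      int count = in.readInt();
      return new Accum(sum, count);
    }
  }

  public static void main(String[] args) throws IOException {
    Accum accum = new Accum(10, 4);
    byte[] compact = encode(accum);
    byte[] serialized = encodeWithJavaSerialization(accum);
    Accum roundTripped = decode(compact);

    System.out.println("hand-written bytes: " + compact.length);
    System.out.println("java.io.Serializable bytes: " + serialized.length);
    System.out.println("average after round trip: "
        + (double) roundTripped.sum / roundTripped.count);
  }
}
```

   The hand-written form stores only the two ints, while Java serialization also writes a stream header and a class descriptor, which is one reason relying on `java.io.Serializable` can be the less efficient choice for small accumulators.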

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 146519)
    Time Spent: 1h 20m  (was: 1h 10m)

> Fix sample code (AverageFn) in Combine.java
> -------------------------------------------
>
>                 Key: BEAM-5427
>                 URL: https://issues.apache.org/jira/browse/BEAM-5427
>             Project: Beam
>          Issue Type: Improvement
>          Components: examples-java
>            Reporter: Ruoyun Huang
>            Assignee: Ruoyun Huang
>            Priority: Minor
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The sample code is missing a coder.
> In its current state, a job run fails with a missing-Coder error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
