beam-commits mailing list archives

From "Frances Perry (JIRA)" <>
Subject [jira] [Updated] (BEAM-12) Apply GroupByKey transforms on PCollection of normal type other than KV
Date Sun, 14 Feb 2016 15:58:18 GMT


Frances Perry updated BEAM-12:
       Assignee: Frances Perry
       Priority: Trivial  (was: Major)
    Component/s: sdk-java-core

If you need to do something to the elements to extract the key before grouping, you can use
a ParDo (or a derivative like MapElements). So something like:
input.apply(ParDo.of(new ExtractFn()))
        .apply(GroupByKey.<K, V>create());

I'm not sure what you meant by automatically extracting keys from data -- that sounds like
something that would be application- or domain-specific.

As always, if you find yourself using a pattern often in your applications, you can create
your own composite PTransform to do it more compactly.
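
To make the shape of such a composite concrete, here is a minimal plain-Java model of the
two steps it would bundle: extract a key from each element, then group by that key. It
deliberately uses only java.util rather than the Beam SDK, so GroupBySketch and
groupByExtractedKey are hypothetical names for illustration and are not part of Beam. In a
real pipeline the same two steps would be the ParDo and GroupByKey shown above, wrapped in
a PTransform.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model only; none of these names are Beam APIs.
final class GroupBySketch {

  // Step 1 pairs each element with its extracted key (the ParDo's job);
  // step 2 collects the elements that share a key (GroupByKey's job).
  static <K, V> Map<K, List<V>> groupByExtractedKey(
      Iterable<V> input, Function<V, K> extractKey) {
    // LinkedHashMap keeps first-seen key order so the printed result is stable.
    Map<K, List<V>> grouped = new LinkedHashMap<>();
    for (V element : input) {
      K key = extractKey.apply(element);
      grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(element);
    }
    return grouped;
  }

  public static void main(String[] args) {
    // Group words by their length; the key function is supplied by the caller.
    List<String> words = Arrays.asList("a", "bb", "cc", "d");
    System.out.println(groupByExtractedKey(words, String::length));
  }
}
```

The caller still supplies the key function, which is the application-specific part. The
stable ordering here is only a property of this in-memory sketch; Beam's GroupByKey makes
no ordering guarantee for the grouped values.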

> Apply GroupByKey transforms on PCollection of normal type other than KV
> -----------------------------------------------------------------------
>                 Key: BEAM-12
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: bakeypan
>            Assignee: Frances Perry
>            Priority: Trivial
> Currently, the GroupByKey transform can only be applied to a PCollection<KV<K,V>>, so I
> have to transform a PCollection<T> into a PCollection<KV<K,V>> before I can apply
> GroupByKey.
> I think we can do better by allowing GroupByKey to be applied to a PCollection of a normal
> type other than KV. The user could supply a custom key-extraction function, or we could
> offer a default one, like this:
> PCollection<T> input = ...
> PCollection<KV<K, Iterable<V>>> result =
>     input.apply(GroupByKey.<K, V>create(new ExtractFn()));

