flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1098) flatArray() operator that converts arrays to elements
Date Tue, 16 Sep 2014 21:34:34 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136277#comment-14136277 ]

Fabian Hueske commented on FLINK-1098:

The current APIs support many use cases, and recent additions and proposals are shortcuts
rather than new functionality. I think we are now at a point where we should think about
whether we want (1) an API with many built-in features or (2) a concise set of the most common

While (1) would mean a very rich feature set which could make many things very comfortable
for users, it has some drawbacks, such as high maintenance effort (incl. documentation and
porting to other language bindings) and a potentially bloated API that makes it hard for
new users to find their way around.
On the other hand (2) offers less user comfort, but is easier to maintain and easy to become
familiar with. 

A compromise could be to extract some of the non-fundamental features from DataSet and put
them into some add-on operator package. That way we could maintain a concise API while having
the option to use a rich operator set.
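
As a rough illustration of the add-on idea, the sketch below keeps the shortcut outside the core type and builds it purely from an operation the core already offers. It uses plain Java streams as a stand-in for DataSet, so it is not Flink code; the class name AddOnOperators and its flatArray method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical add-on class: it lives outside the core API and only
// composes operations that the core already provides.
public final class AddOnOperators {

    private AddOnOperators() {
        // static helpers only, no instances
    }

    // Convenience shortcut: flattens a stream of arrays into a stream of elements.
    public static <T> Stream<T> flatArray(Stream<T[]> arrays) {
        return arrays.flatMap(Arrays::stream);
    }

    public static void main(String[] args) {
        Stream<String[]> split = Stream.of("a b", "c").map(s -> s.split(" "));
        List<String> words = flatArray(split).collect(Collectors.toList());
        System.out.println(words);
    }
}
```

With this layout the core API surface stays unchanged, and users who want the shortcut opt in by calling the helper class.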

I am not strictly against adding new operators to the APIs but I think we should have a discussion
about this issue. 
I tend to go with the second option (concise API). If we find a way to go with the add-on operator
package, even better.

What do you think?

> flatArray() operator that converts arrays to elements
> -----------------------------------------------------
>                 Key: FLINK-1098
>                 URL: https://issues.apache.org/jira/browse/FLINK-1098
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Timo Walther
>            Priority: Minor
> It would be great to have an operator that converts e.g. from String[] to String. Actually,
> it is just a flatMap over the elements of an array.
> A typical use case is a WordCount where we then could write:
> {code}
> text
> .map((line) -> line.toLowerCase().split("\\W+"))
> .flatArray()
> .map((word) -> new Tuple2(word, 1))
> .groupBy(0)
> .sum(1);
> {code}
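
The description's remark that the operator is just a flatMap over the array elements can be sketched without any new operator. The example below is a standalone illustration with plain Java streams, not Flink's DataSet API; the class name FlatArraySketch and the method wordCount are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FlatArraySketch {

    // Hypothetical helper: maps each line to a String[] and then flattens
    // the arrays into individual words before counting them.
    public static Map<String, Long> wordCount(Stream<String> text) {
        return text
            .map(line -> line.toLowerCase().split("\\W+")) // String -> String[]
            .flatMap(Arrays::stream)                        // the "flatArray" step
            .filter(word -> !word.isEmpty())                // split may yield a leading empty token
            .collect(Collectors.groupingBy(
                Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        Stream<String> text = Stream.of("To be, or not to be", "that is the question");
        System.out.println(wordCount(text));
    }
}
```

The only step that differs from the proposed pipeline is flatMap(Arrays::stream), which takes the place of flatArray().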

This message was sent by Atlassian JIRA
