beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Xu Mingmin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-68) Support for limiting parallelism of a step
Date Mon, 06 Mar 2017 21:32:33 GMT

    [ https://issues.apache.org/jira/browse/BEAM-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898151#comment-15898151
] 

Xu Mingmin commented on BEAM-68:
--------------------------------

This's required by some runners. With this parameter, runners, like Flink/Storm can leverage
it, and those, like Dataflow can ignore it.
I'm not sure about the existing implementation of Flink runner, seems like set in job level,
meaning same parallelism for each step.

FYI Flink parallel https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/parallel.html

Storm parallel http://storm.apache.org/releases/1.0.1/Understanding-the-parallelism-of-a-Storm-topology.html

> Support for limiting parallelism of a step
> ------------------------------------------
>
>                 Key: BEAM-68
>                 URL: https://issues.apache.org/jira/browse/BEAM-68
>             Project: Beam
>          Issue Type: New Feature
>          Components: beam-model
>            Reporter: Daniel Halperin
>
> Users may want to limit the parallelism of a step. Two classic uses cases are:
> - User wants to produce at most k files, so sets TextIO.Write.withNumShards(k).
> - External API only supports k QPS, so user sets a limit of k/(expected QPS/step) on
the ParDo that makes the API call.
> Unfortunately, there is no way to do this effectively within the Beam model. A GroupByKey
with exactly k keys will guarantee that only k elements are produced, but runners are free
to break fusion in ways that each element may be processed in parallel later.
> To implement this functionaltiy, I believe we need to add this support to the Beam Model.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message