beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Work logged] (BEAM-3089) Issue with setting the parallelism at client level using Flink runner
Date Tue, 18 Sep 2018 20:04:00 GMT


ASF GitHub Bot logged work on BEAM-3089:

                Author: ASF GitHub Bot
            Created on: 18/Sep/18 20:03
            Start Date: 18/Sep/18 20:03
    Worklog Time Spent: 10m 
      Work Description: angoenka commented on a change in pull request #6426: [BEAM-3089] Fix default values in FlinkPipelineOptions / Add tests

 File path: runners/flink/src/main/java/org/apache/beam/runners/flink/
 @@ -56,12 +56,13 @@
       "Address of the Flink Master where the Pipeline should be executed. Can"
           + " either be of the form \"host:port\" or one of the special values [local], "
           + "[collection] or [auto].")
+  @Default.String("[auto]")
   String getFlinkMaster();
   void setFlinkMaster(String value);
   @Description("The degree of parallelism to be used when distributing operations onto workers.")
-  @Default.InstanceFactory(DefaultParallelismFactory.class)
+  @Default.Integer(-1)
 Review comment:
   Nit: update the description to state what a value <= 0 means.
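
A minimal sketch of one way the non-positive default could be interpreted, assuming that a value <= 0 means "not set" and that the runner then falls back to whatever parallelism the execution environment supplies, or to 1 if none is known. The class and method names are hypothetical; they are not taken from the pull request or from the Beam or Flink APIs.

```java
/** Hypothetical helper for illustration only; not part of Beam or Flink. */
final class ParallelismSketch {
    private ParallelismSketch() {}

    /**
     * Resolves the parallelism to use for a job.
     *
     * @param configured value from the pipeline options; <= 0 is assumed to mean "not set"
     * @param environmentDefault parallelism supplied at submission time (for example by a
     *     CLI or web UI); <= 0 is assumed to mean "unknown"
     * @return the explicit value if positive, else the environment default if positive, else 1
     */
    static int resolve(int configured, int environmentDefault) {
        if (configured > 0) {
            // An explicit, positive setting always wins.
            return configured;
        }
        if (environmentDefault > 0) {
            // Sentinel (-1) or zero: defer to the submission-time value.
            return environmentDefault;
        }
        // Nothing usable was found anywhere.
        return 1;
    }
}
```

Under these assumptions, the -1 default leaves room for a submission-time value to take effect, whereas a factory that eagerly returns 1 would hide it.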

Issue Time Tracking

    Worklog Id:     (was: 145459)
    Time Spent: 2h  (was: 1h 50m)

> Issue with setting the parallelism at client level using Flink runner
> ---------------------------------------------------------------------
>                 Key: BEAM-3089
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.0.0
>         Environment: I am using Flink 1.2.1 running on Docker, with Task Managers distributed across different VMs as part of a Docker Swarm.
>            Reporter: Thalita Vergilio
>            Assignee: Grzegorz Kołakowski
>            Priority: Major
>              Labels: docker, flink, parallel-deployment
>             Fix For: 2.8.0
>         Attachments: flink-ui-parallelism.png
>          Time Spent: 2h
>  Remaining Estimate: 0h
> When uploading an Apache Beam application using the Flink Web UI, the parallelism set at job submission doesn't get picked up. The same happens when submitting a job using the Flink CLI.
> In both cases, the parallelism ends up defaulting to 1.
> When I set the parallelism programmatically within the Apache Beam code, it works: {{flinkPipelineOptions.setParallelism(4);}}
> I suspect the root of the problem may be in the org.apache.beam.runners.flink.DefaultParallelismFactory class, as it checks for Flink's GlobalConfiguration, which may not pick up runtime values passed to Flink, then defaults to 1 if it doesn't find anything.
> Any ideas on how this could be fixed or worked around? I need to be able to change the parallelism dynamically, so the programmatic approach won't really work for me, nor will setting the Flink configuration at system level.

This message was sent by Atlassian JIRA
