beam-commits mailing list archives

From "Amit Sela (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-757) The SparkRunner should utilize the SDK's DoFnRunner instead of writing its own.
Date Tue, 18 Oct 2016 07:14:58 GMT

    [ https://issues.apache.org/jira/browse/BEAM-757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15584706#comment-15584706 ]

Amit Sela commented on BEAM-757:
--------------------------------

I have something (working!) using the {{SimpleDoFnRunner}} here: https://github.com/amitsela/incubator-beam/blob/BEAM-757-WIP/runners/spark/src/main/java/org/apache/beam/runners/spark/util/SparkDoFnRunner.java

I had to expose OldDoFn and OutputManager for that (https://github.com/amitsela/incubator-beam/blob/BEAM-757-WIP/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java).

As for OldDoFn: I had to call setup() and teardown() myself, since the DoFnRunner doesn't
seem to do that. Maybe it should? (Spark might need to override this anyway, to run teardown
after finishBundle, but still.)
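
A minimal sketch of that wrapping, using hypothetical stand-in types rather than Beam's actual
classes: the runner stand-in drives only startBundle, element processing and finishBundle, so
the caller adds setup() before and teardown() after. Calling them once per partition, and
running teardown in a finally block, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins; none of these types are Beam's actual classes. */
class LifecycleSketch {

  /** Simplified DoFn-like lifecycle (assumed shape, for illustration only). */
  interface Fn<T> {
    void setup();
    void startBundle();
    void processElement(T element);
    void finishBundle();
    void teardown();
  }

  /** Stand-in for a runner that drives the bundle but not setup/teardown. */
  static <T> void runBundle(Fn<T> fn, Iterable<T> bundle) {
    fn.startBundle();
    for (T element : bundle) {
      fn.processElement(element);
    }
    fn.finishBundle();
  }

  /** What the calling code adds: setup first, teardown last, even on failure. */
  static <T> void processPartition(Fn<T> fn, Iterable<T> partition) {
    fn.setup();
    try {
      runBundle(fn, partition);
    } finally {
      fn.teardown();
    }
  }

  /** Records every lifecycle call so the resulting order can be inspected. */
  static List<String> traceCalls(List<String> elements) {
    final List<String> calls = new ArrayList<>();
    processPartition(new Fn<String>() {
      @Override public void setup() { calls.add("setup"); }
      @Override public void startBundle() { calls.add("startBundle"); }
      @Override public void processElement(String e) { calls.add("process:" + e); }
      @Override public void finishBundle() { calls.add("finishBundle"); }
      @Override public void teardown() { calls.add("teardown"); }
    }, elements);
    return calls;
  }
}
```

With this arrangement teardown always lands after finishBundle, which is the ordering the
parenthetical above is concerned with.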

OutputManager needed to be exposed so that the runner can access the output and, in Spark's
case, clear it between elements, because Spark partitions (roughly equivalent to bundles) can
be quite big.
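
The idea of clearing the output between elements can be sketched as a lazy iterator that drains
a buffer before pulling the next input, so the buffer never holds more than one element's worth
of output. BufferingOutput and process() are hypothetical names for this illustration, not the
SDK's OutputManager API or the code in the linked branch.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.BiConsumer;

/** Hypothetical sketch; these are not Beam's or Spark's actual classes. */
class OutputDrainSketch {

  /** Output buffer that the caller can both read from and empty. */
  static class BufferingOutput<O> {
    private final ArrayDeque<O> buffer = new ArrayDeque<>();

    void output(O value) { buffer.add(value); }
    boolean isEmpty() { return buffer.isEmpty(); }
    O poll() { return buffer.poll(); }
    int size() { return buffer.size(); }
  }

  /**
   * Lazily processes inputs. The next input is pulled only once the buffer is
   * empty, so memory use is bounded by a single element's outputs rather than
   * by the size of the whole partition.
   */
  static <I, O> Iterator<O> process(
      final Iterator<I> inputs,
      final BiConsumer<I, BufferingOutput<O>> fn,
      final BufferingOutput<O> out) {
    return new Iterator<O>() {
      @Override
      public boolean hasNext() {
        // Skip over inputs that emit nothing until output appears or input ends.
        while (out.isEmpty() && inputs.hasNext()) {
          fn.accept(inputs.next(), out);
        }
        return !out.isEmpty();
      }

      @Override
      public O next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        return out.poll();
      }
    };
  }
}
```

An iterator of this shape could be returned from a per-partition function, which is one way to
keep a large partition from materializing all of its output at once.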

Other than that, it was pretty straightforward for me to use it. I'll open a PR once the
pending PRs are merged.

> The SparkRunner should utilize the SDK's DoFnRunner instead of writing its own.
> -------------------------------------------------------------------------------
>
>                 Key: BEAM-757
>                 URL: https://issues.apache.org/jira/browse/BEAM-757
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
>
> The SDK now provides DoFnRunner implementations, and so to avoid maintaining against
> the SDK, the runner should leverage the runner API instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
