beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (BEAM-1925) Make DoFn invocation logic of Python SDK more extensible
Date Mon, 08 May 2017 20:47:04 GMT


ASF GitHub Bot commented on BEAM-1925:

Github user sb2nov closed the pull request at:

> Make DoFn invocation logic of Python SDK more extensible
> --------------------------------------------------------
>                 Key: BEAM-1925
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py
>            Reporter: Chamikara Jayalath
>            Assignee: Chamikara Jayalath
> DoFn invocation logic of Python SDK is currently in DoFnRunner class.
> At initialization of this, we parse a DoFn and create local state. We use this state
when invoking DoFn methods process, start_bundle, and finish_bundle. For example, we store
a list of  ArgPlaceholder objects within the state of DoFnRunner to facilitate invocation
of process method.
> We will need to extend this functionality when adding new features to DoFn class (for
example to support Splittable DoFn [1]). So I think it's good to refactor this code to be
more extensible. 
> I think a good approach for this is to add DoFnInvoker and DoFnSignature classes similar
to Java SDK [2].
> In this approach:
> A DoFnSignature captures the signature of a DoFn including methods and arguments.
> A DoFnInvoker implements a particular way DoFn methods will be executed (initially we'll
have simple and per-window invokers [3]).
> A runner uses DoFnRunner to execute methods of a given DoFn. At initialization, DoFnRunner
crates a DoFnSignature and a DoFnInvoker for the given DoFn.
> DoFnSignature and DoFnInvoker methods will be used by SplittableDoFn implementation as
> [1]
> [2]
> [3]

This message was sent by Atlassian JIRA

View raw message