beam-commits mailing list archives

From "Pablo Estrada (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-722) Add Display Data to the Python SDK
Date Fri, 14 Oct 2016 19:59:20 GMT

    [ https://issues.apache.org/jira/browse/BEAM-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15576336#comment-15576336 ]

Pablo Estrada commented on BEAM-722:
------------------------------------

I am working on adding this feature. I'm basically mirroring the way in which we do this in
Java, but trying to make it a bit more Pythonic. Here's a quick example of how this would
work:

{code:title=display_data_example.py|borderStyle=solid}
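import apache_beam as beam

class MyParDo(beam.PTransform):
  def display_data(self):
    # Plain values are displayed as-is; a dict value carries extra
    # parameters (such as 'url' or 'label') alongside its 'value'.
    # self.fn is assumed to hold the underlying DoFn, set elsewhere.
    return {'disp_data_key': MyParDo,
            'loneliest_number': 1,
            'secret_url': {'value': 'awebsite.com', 'url': 'http://awebsite.com'},
            'fn': {'value': self.fn, 'label': 'Display data of underlying DoFn'}}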
{code}

I'm renaming the populateDisplayData function to display_data. Instead of using a builder,
it returns a dictionary of key:value pairs. If a user wants to specify more parameters
than just the value, they can pass a dictionary containing them as the value. Also, if the
value is an object that inherits from the HasDisplayData class, that object's display data
will be included as well.
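
To make the expansion rule above concrete, here is a rough, self-contained sketch of how such a dictionary could be flattened into display items. It is not the code from the commit linked in this comment: the helper collect_display_data, the item layout (namespace/key/value), the stand-in HasDisplayData base class, and the MyDoFn class are all hypothetical names chosen for illustration. It also assumes that a nested HasDisplayData value contributes only its own items rather than an item for the enclosing key.

```python
class HasDisplayData(object):
    """Stand-in base class (hypothetical): anything that can describe itself."""

    def display_data(self):
        # Default: nothing to display.
        return {}


def collect_display_data(component):
    """Flatten component.display_data() into a list of item dicts.

    Hypothetical helper; the item layout is an assumption made for this sketch.
    """
    namespace = type(component).__name__
    items = []
    for key, spec in component.display_data().items():
        # A dict value carries extra parameters next to its 'value' entry;
        # any other value is treated as the value itself.
        extras = dict(spec) if isinstance(spec, dict) else {'value': spec}
        value = extras.pop('value')
        if isinstance(value, HasDisplayData):
            # Nested component: include its display data (assumed behavior).
            items.extend(collect_display_data(value))
        else:
            item = {'namespace': namespace, 'key': key, 'value': value}
            item.update(extras)
            items.append(item)
    return items


class MyDoFn(HasDisplayData):
    def display_data(self):
        return {'threshold': 10}


class MyParDo(HasDisplayData):
    def __init__(self, fn):
        self.fn = fn

    def display_data(self):
        return {'loneliest_number': 1,
                'secret_url': {'value': 'awebsite.com',
                               'url': 'http://awebsite.com'},
                'fn': {'value': self.fn,
                       'label': 'Display data of underlying DoFn'}}


items = collect_display_data(MyParDo(MyDoFn()))
```

In this sketch the two plain entries of MyParDo become items under the MyParDo namespace, and the 'fn' entry is replaced by the items that MyDoFn reports about itself.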

I have a [small commit|https://github.com/pabloem/incubator-beam/commit/f3c7ebd24ecfd0b46aa4b2d6c906c4c1331fd13a]
that adds this. You can see some [examples in the unit tests|https://github.com/pabloem/incubator-beam/commit/f3c7ebd24ecfd0b46aa4b2d6c906c4c1331fd13a#diff-74a8ae565b6cf2631423124a587c2beaR1].

If everyone is okay with this, I'll add comments, tests, and address feedback.

> Add Display Data to the Python SDK
> ----------------------------------
>
>                 Key: BEAM-722
>                 URL: https://issues.apache.org/jira/browse/BEAM-722
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py
>            Reporter: Pablo Estrada
>            Assignee: Frances Perry
>
> The DisplayData feature has been added to the Java SDK (see the blog post announcing it:
> https://cloud.google.com/blog/big-data/2016/06/dataflow-updates-see-more-details-about-your-pipelines).
> We now need to add it to the Python SDK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
