crunch-dev mailing list archives

From "Micah Whitacre (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-315) Empty collection
Date Sat, 28 Dec 2013 03:35:52 GMT


Micah Whitacre commented on CRUNCH-315:

I would assume it would have to be serialized to disk. In most of our cases where we've
needed a collection like that, the data size is typically a lot smaller, so we've done
map-side joins or injected the in-memory collection into the DoFn, with custom serialization
logic if needed.
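
To make the "inject the in-memory collection into the DoFn" option concrete, here is a
minimal sketch in plain Java. It does not use the Crunch API: KeepKnownKeysFn is a
hypothetical stand-in for a DoFn, the List passed to process() stands in for an Emitter,
and roundTrip() only imitates the function object being shipped to a task through Java
serialization. All names are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class InMemoryLookupSketch {

    // Hypothetical stand-in for a DoFn that carries a small in-memory collection.
    public static class KeepKnownKeysFn implements Serializable {
        private static final long serialVersionUID = 1L;

        // transient: skipped by default serialization and written by hand below.
        private transient Set<String> knownKeys;

        public KeepKnownKeysFn(Set<String> knownKeys) {
            this.knownKeys = new HashSet<>(knownKeys);
        }

        // Emits the input only when it is present in the in-memory set.
        public void process(String input, List<String> out) {
            if (knownKeys.contains(input)) {
                out.add(input);
            }
        }

        // Custom serialization: write the element count, then each element.
        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject();
            oos.writeInt(knownKeys.size());
            for (String key : knownKeys) {
                oos.writeUTF(key);
            }
        }

        // Custom deserialization: rebuild the set from the count and elements.
        private void readObject(ObjectInputStream ois)
                throws IOException, ClassNotFoundException {
            ois.defaultReadObject();
            int n = ois.readInt();
            knownKeys = new HashSet<>();
            for (int i = 0; i < n; i++) {
                knownKeys.add(ois.readUTF());
            }
        }
    }

    // Imitates shipping the function to a task: serialize, then deserialize.
    public static KeepKnownKeysFn roundTrip(KeepKnownKeysFn fn) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(fn);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (KeepKnownKeysFn) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Applies the deserialized copy of the function to every input.
    public static List<String> run(List<String> inputs, Set<String> keys) {
        KeepKnownKeysFn fn = roundTrip(new KeepKnownKeysFn(keys));
        List<String> out = new ArrayList<>();
        for (String input : inputs) {
            fn.process(input, out);
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> keys = new HashSet<>(Arrays.asList("a", "c"));
        System.out.println(run(Arrays.asList("a", "b", "c", "a"), keys)); // prints [a, c, a]
    }
}
```

This only makes sense while the set comfortably fits in each task's memory, which is the
"data size is typically a lot smaller" case mentioned above; larger data would still
have to go to disk.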

> Empty collection
> ----------------
>                 Key: CRUNCH-315
>                 URL:
>             Project: Crunch
>          Issue Type: New Feature
>            Reporter: Chao Shi
>         Attachments: CRUNCH-315.patch
> As discussed in the mailing list [1] and [2], I'd like to add an empty collection feature.
> On the API side, I think we can add a new method in Pipeline to create an empty collection.
> The collection should be a subclass of PCollection and behave like other normal PCollections.
> There are also some optimization points that Josh mentioned in [2].
> I haven't thought it through clearly. Just put a ticket here and see if anyone else has a better
> [1]
> [2]

This message was sent by Atlassian JIRA
