crunch-dev mailing list archives

From "Josh Wills (JIRA)" <>
Subject [jira] [Updated] (CRUNCH-127) Allow multiple HBaseTargets in a single pipeline
Date Wed, 12 Dec 2012 08:27:21 GMT


Josh Wills updated CRUNCH-127:

    Attachment: CRUNCH-127.patch

First cut at this. I banged my head against making HBaseTarget work w/MultipleOutputs, to
no avail. In the process, I rewrote most of the MultipleOutputs stuff to make it work more
like CrunchInputs, which has some advantages (and some disadvantages) that might be worth
exploring later.

In the meantime, here's a simple patch that adds support for HBase's MultiTableOutputFormat.
For this change, the key is the name of the table to write to, and the value is either a Put
or a Delete, so it needs to be given a PTable<ImmutableBytesWritable, Put|Delete> in
order to work. I still need to write an integration test, but let me know if you get a chance
to bang on it.
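
To illustrate the keying convention described above (key = destination table name, value = a Put or a Delete), here is a minimal, dependency-free sketch in plain Java. It is not the code from the patch and does not use the HBase or Crunch APIs: MultiTableRoutingSketch, TableWrite, Mutation, Kind, and route are all hypothetical names invented for this example, and the Mutation class is only a stand-in for HBase's Put and Delete.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of multi-table output routing; no HBase classes are used.
public class MultiTableRoutingSketch {

    // Stand-in for the two value types the message mentions (Put or Delete).
    enum Kind { PUT, DELETE }

    // Stand-in for an HBase mutation: what to do, and to which row.
    static final class Mutation {
        final Kind kind;
        final String rowKey;

        Mutation(Kind kind, String rowKey) {
            this.kind = kind;
            this.rowKey = rowKey;
        }
    }

    // One output record: the key names the destination table,
    // the value is the mutation to apply to that table.
    static final class TableWrite {
        final String table;
        final Mutation mutation;

        TableWrite(String table, Mutation mutation) {
            this.table = table;
            this.mutation = mutation;
        }
    }

    // Groups mutations by the table named in each record's key, so that every
    // table receives only its own writes (rather than all writes going to
    // whichever table was configured first).
    static Map<String, List<Mutation>> route(List<TableWrite> writes) {
        Map<String, List<Mutation>> byTable = new LinkedHashMap<>();
        for (TableWrite w : writes) {
            byTable.computeIfAbsent(w.table, t -> new ArrayList<>()).add(w.mutation);
        }
        return byTable;
    }

    public static void main(String[] args) {
        List<TableWrite> writes = new ArrayList<>();
        writes.add(new TableWrite("users", new Mutation(Kind.PUT, "row1")));
        writes.add(new TableWrite("events", new Mutation(Kind.PUT, "row9")));
        writes.add(new TableWrite("users", new Mutation(Kind.DELETE, "row2")));

        // Print each table with the number of mutations routed to it.
        for (Map.Entry<String, List<Mutation>> e : route(writes).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size() + " mutation(s)");
        }
    }
}
```

The point of the sketch is only that the destination travels with each record, so a single output stream can feed several tables; how the patch wires this into a Crunch pipeline is not shown here.
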
> Allow multiple HBaseTargets in a single pipeline
> ------------------------------------------------
>                 Key: CRUNCH-127
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Micah Whitacre
>            Assignee: Josh Wills
>         Attachments: CRUNCH-127.patch
> Currently, when a pipeline contains writes to multiple HBaseTargets, all puts are sent
> to the first configured HBaseTarget. The second one is ignored, which causes issues if the
> columns are not the same.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators