crunch-dev mailing list archives

From "Josh Wills (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-308) Upgrade to Hadoop 2.2.0 and HBase 0.96
Date Wed, 04 Dec 2013 16:47:37 GMT


Josh Wills commented on CRUNCH-308:

[~stepinto] that is correct: the Target is allowed to override the PType's normal Converter
instance in these cases.

The only real functional change is in how we do the sort for HBase bulk loads. What we had
did the partitioned sort on the Writable KeyValue objects; I changed it to do the sort on the
ImmutableBytesWritable form of KeyValue.getRow(). This is how it is done in HBase 0.96:

I didn't see a good way to do the partitioned sort for the bulk load without making this
change, because sorting on the non-Writable KeyValue (instead of the Writable ImmutableBytesWritable)
would have involved changing a bunch of the Sort.sort() logic.
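
To illustrate the idea of sorting on the row key alone, here is a minimal sketch in plain Java.
It is not the code from the patch and uses no HBase or Crunch classes: Cell is a hypothetical
stand-in for KeyValue, and the comparator assumes that the ImmutableBytesWritable ordering is a
lexicographic comparison of the row bytes treated as unsigned values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class RowKeySortSketch {
    // Hypothetical stand-in for an HBase KeyValue; only the fields this sketch needs.
    static final class Cell {
        final byte[] row;
        final String qualifier;

        Cell(byte[] row, String qualifier) {
            this.row = row;
            this.qualifier = qualifier;
        }
    }

    // Orders cells by row key only, comparing bytes lexicographically as unsigned values.
    // Assumption: this mirrors the ordering used for ImmutableBytesWritable row keys.
    static final Comparator<Cell> BY_ROW =
        (a, b) -> Arrays.compareUnsigned(a.row, b.row);

    // Returns a new list sorted by row key; the input list is left untouched.
    static List<Cell> sortByRow(List<Cell> cells) {
        List<Cell> sorted = new ArrayList<>(cells);
        sorted.sort(BY_ROW); // List.sort is stable, so cells of one row keep their order
        return sorted;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell(new byte[] {(byte) 0x80}, "r80"),
            new Cell(new byte[] {0x01, 0x00}, "r1-0"),
            new Cell(new byte[] {0x01}, "r1"));
        for (Cell c : sortByRow(cells)) {
            System.out.println(c.qualifier);
        }
    }
}
```

The point of the sketch is only that the sort key is the row bytes rather than the whole cell;
a signed byte comparison would wrongly place the 0x80 row first.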

> Upgrade to Hadoop 2.2.0 and HBase 0.96
> --------------------------------------
>                 Key: CRUNCH-308
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Josh Wills
>         Attachments: CRUNCH-HBASE96.patch
> As discussed on dev@crunch, we should update Crunch to run against the new mainline releases
> of Hadoop (2.2.0) and HBase (0.96).
> There isn't a good way to maintain a shim between HBase 0.94 and HBase 0.96 due to a
> number of API changes, so this change means that support for HBase 0.94 will remain in the
> 0.8.x sequence of Crunch releases, and 0.96 will be the supported version from 0.9.0 onwards.

