crunch-dev mailing list archives

From "Ryan Brush (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-246) HFileSource
Date Wed, 07 Aug 2013 15:24:50 GMT


Ryan Brush commented on CRUNCH-246:

@Micah: fair enough. I actually did this (prior to Crunch) for a similar need of manipulating
HFiles generated outside of HBase. Just wanted to share the perception that such a thing could
be prone to misuse.

In any case, if we want to move forward with this, I can dust off the implementation I had
and post it, at least as a reference here. It needs some focused unit and integration tests,
but is pretty similar to the links Chao posted in the original description.

> HFileSource
> -----------
>                 Key: CRUNCH-246
>                 URL:
>             Project: Crunch
>          Issue Type: Improvement
>          Components: IO
>            Reporter: Chao Shi
>            Assignee: Chao Shi
> I found this useful when performing MR directly on HFiles. I used it yesterday when copying
> a bunch of HFiles to another cluster (where the region layout is different).
> There is no HFileInputFormat provided by HBase, but I found the following from Google:
> (Java version of the above. The webpage is in Chinese, but you can see the code.)
> I'm not sure whether we can copy their code directly (copyright issue?). Does anyone know?

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators