crunch-dev mailing list archives

From "Chao Shi (JIRA)" <>
Subject [jira] [Resolved] (CRUNCH-267) Fix several HFileUtils#scanHFiles related problems
Date Fri, 20 Sep 2013 04:09:52 GMT


Chao Shi resolved CRUNCH-267.

    Resolution: Fixed
      Assignee: Chao Shi

Committed to master.
> Fix several HFileUtils#scanHFiles related problems
> --------------------------------------------------
>                 Key: CRUNCH-267
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Chao Shi
>            Assignee: Chao Shi
>         Attachments: crunch-267.patch
> This patch fixes several problems with HFileUtils#scanHFiles that were discovered on
> our production cluster.
> 1. The usage of "" is wrong.
> A return value of -1 indicates that all KVs in the HFile are greater than the given key,
> so we should continue to scan. I replaced it with seekAtOrAfter, which is copied from the
> HBase code, and added a few tests (testScanFiles_startRow{IsTooSmall, IsTooLarge,
> DoesNotExist}) to cover this.
> 2. The default implementation of HFileSource#getSize does not correctly estimate the
> size of the input if the input HFiles are in a sub-directory (i.e. input/family/hfile).
> 3. There are some tricky cases involving Delete/DeleteColumn. I added some test cases and
> fixed the related code. (Hopefully my test cases cover this.)
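
To illustrate item 1, here is a minimal sketch of "seek at or after" semantics over a sorted in-memory list. It is not the Crunch or HBase code: the class SeekSketch, the method signatures, and scanFrom are hypothetical names chosen for illustration, and plain string row keys stand in for KVs. The point it shows is that a start row smaller than every key must still begin the scan at the first entry rather than end it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not the actual HFileUtils or HBase implementation.
final class SeekSketch {

    // Returns the index of the first key >= startRow,
    // or sortedKeys.size() if every key is smaller than startRow.
    static int seekAtOrAfter(List<String> sortedKeys, String startRow) {
        int idx = Collections.binarySearch(sortedKeys, startRow);
        if (idx >= 0) {
            return idx;          // exact match: start at that key
        }
        return -idx - 1;         // insertion point: first key greater than startRow
    }

    // Returns all keys from the seek position to the end of the list.
    static List<String> scanFrom(List<String> sortedKeys, String startRow) {
        int start = seekAtOrAfter(sortedKeys, startRow);
        return new ArrayList<>(sortedKeys.subList(start, sortedKeys.size()));
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row2", "row4", "row6");
        System.out.println(scanFrom(rows, "row1")); // start row too small: [row2, row4, row6]
        System.out.println(scanFrom(rows, "row3")); // start row does not exist: [row4, row6]
        System.out.println(scanFrom(rows, "row4")); // exact match: [row4, row6]
        System.out.println(scanFrom(rows, "row9")); // start row too large: []
    }
}
```

The three non-matching cases mirror the test names mentioned in the description (start row too small, too large, or not present); the real tests operate on HFiles rather than on a list.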

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
