gora-dev mailing list archives

From "Eric Newton (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GORA-116) gora treats split points as if they represent actual values in the table
Date Wed, 11 Apr 2012 12:20:17 GMT

    [ https://issues.apache.org/jira/browse/GORA-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251497#comment-13251497 ]

Eric Newton commented on GORA-116:

If I know in advance the distribution of my table, I can split it early to increase the speed
at which I can ingest new data.  I just used the shortest representation of the splits I wanted.

Accumulo uses the minimum difference between two rows as its natural split point in order
to keep the metadata about splits small.  Keith worked around this in the Accumulo back-end.

It's a little more complicated than just padding.  If I split at \x10, \x10\x00 comes after
\x10, which puts it in the next tablet.  It will probably all just work since we're just providing
a locality hint, and a little spillage off the ends of the tablet isn't going to impact performance
that much.
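
To illustrate the ordering point above, here is a minimal sketch. It is not Accumulo or Gora code: `tablet_index` is a hypothetical helper, and it assumes that split points are kept sorted and that a split point is the inclusive end row of its tablet, which matches the behavior described in the comment.

```python
from bisect import bisect_left


def tablet_index(split_points, row):
    """Return the index of the tablet that would hold `row`.

    Assumption: `split_points` is sorted, and each split point is the
    inclusive end row of its tablet, so a row equal to a split point
    stays in the earlier tablet.
    """
    return bisect_left(split_points, row)


splits = [b"\x10"]  # one single-byte split point, so two tablets

# Python compares bytes lexicographically as unsigned values,
# so a proper prefix sorts before any longer key that extends it.
assert b"\x10" < b"\x10\x00"

rows = (b"\x0f\xff", b"\x10", b"\x10\x00", b"\x10" + b"\x00" * 7)
for row in rows:
    print(row.hex(), "-> tablet", tablet_index(splits, row))
```

Under these assumptions the row \x10 stays in the first tablet, while \x10\x00 and the zero-padded 8-byte key both land in the next one, which is the spillage at the tablet boundary mentioned above.
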
> gora treats split points as if they represent actual values in the table
> ------------------------------------------------------------------------
>                 Key: GORA-116
>                 URL: https://issues.apache.org/jira/browse/GORA-116
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: storage-hbase
>            Reporter: Eric Newton
>            Priority: Minor
> Doing goraci testing with the hbase back-end for gora.  I created single-byte split points.
> When I tried to map-reduce over the table, the gora back-end failed trying to convert the
> split points into Longs.
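
The failure described in the issue can be sketched as follows. This is an illustration only, not Gora's actual conversion code: it assumes keys are stored as 8-byte big-endian signed longs, and `decode_long` and `split_point_to_long` are hypothetical helpers.

```python
import struct

KEY_WIDTH = 8  # assumption: row keys are 8-byte big-endian signed longs


def decode_long(raw):
    """Strictly decode an 8-byte key; raises struct.error for any other length."""
    return struct.unpack(">q", raw)[0]


def split_point_to_long(raw):
    """Hypothetical lenient decoder: right-pad a short split point with zero bytes."""
    if len(raw) > KEY_WIDTH:
        raise ValueError("split point is longer than the key width")
    return decode_long(raw.ljust(KEY_WIDTH, b"\x00"))


try:
    decode_long(b"\x10")  # a single-byte split point is not a full key
except struct.error as exc:
    print("strict decode failed:", exc)

print(split_point_to_long(b"\x10"))
```

The strict decoder rejects the single-byte split point because it is not a complete key. Right-padding yields a usable long, but the padded value is a different byte string from the split point itself, so it should be treated as an approximate boundary rather than a row that exists in the table.
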
