hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: about HBASE-48
Date Sun, 11 Oct 2009 06:54:43 GMT
On Sat, Oct 10, 2009 at 10:54 PM, Anty <anty.rao@gmail.com> wrote:

> Hi stack,
>    I did some tests on the bulk load tools from HBASE-48.

Thanks for trying it out.

> I took the files made by the TestHFileOutputFormat test and passed them to
> the script you wrote. It did work, but something seems unusual. For each
> region, the STARTKEY and ENDKEY are nearly the same; the ENDKEY is bigger
> than the STARTKEY by only about 1, e.g.
>  STARTKEY=>'0000009447',ENDKEY=>'0000009448';
>  STARTKEY=>'0000020476',ENDKEY=>'0000020477';
> ...
Did you write your own partitioner or just use the default hash partitioner?

>    I also have some doubts about TestHFileOutputFormat. The default
> partitioner is the hash partitioner, but the hash partitioner can't meet the
> requirements of TestHFileOutputFormat. As you said, we need to ensure a
> total ordering of all keys, so we need to supply a partitioner that does
> total ordering (but you didn't add a new partitioner in
> TestHFileOutputFormat).

This is broken then, as you point out. We should make something like what is
described in https://issues.apache.org/jira/browse/HBASE-1901 for

>   So I think that because TestHFileOutputFormat uses the hash partitioner, it
> does not do total ordering. Different regions would have overlapping row
> ranges, which is not correct for HBase. And I found that the firstKey and
> lastKey of the files made by TestHFileOutputFormat do indeed overlap.
>    Are the bulk tools just the beginning, needing further improvement? I
> think the bulk tools are very useful.
Can you help us improve it?  What do you think we need to do next?
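
For illustration, the difference between the two partitioning schemes
discussed above can be sketched in plain Java. This is a minimal sketch and
not the HBASE-48 or HBASE-1901 code: the class RangePartitionSketch, its
method names, and the split points are hypothetical, and it compares
java.lang.String values rather than the byte[] row keys HBase uses (the two
orders agree for the ASCII digit keys shown here).

```java
import java.util.Arrays;

/** Hypothetical sketch; not part of HBase or Hadoop. */
final class RangePartitionSketch {

    /**
     * Hash-style assignment (same formula as a typical hash partitioner):
     * neighbouring keys can scatter across different partitions.
     */
    static int hashPartition(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /**
     * Range assignment. splitPoints must be sorted ascending. Partition 0
     * holds keys below splitPoints[0]; partition i holds keys in
     * [splitPoints[i-1], splitPoints[i]); the last partition holds the rest.
     */
    static int rangePartition(String key, String[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // Found at index i: the key equals split point i, so it starts partition i + 1.
        // Not found: binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        // Made-up split points, giving three partitions.
        String[] splits = {"0000010000", "0000020000"};
        String[] keys = {"0000000001", "0000009447", "0000010000", "0000020476"};
        for (String k : keys) {
            System.out.println(k
                + " hash=" + hashPartition(k, splits.length + 1)
                + " range=" + rangePartition(k, splits));
        }
    }
}
```

With hash assignment, adjacent keys can land in different partitions, so each
output file can span a key range that overlaps the others. With sorted split
points, every key in partition i sorts before every key in partition i+1,
which is what non-overlapping regions need.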

Thanks for writing, Anty Rao.
