hbase-user mailing list archives

From John <johnnyenglish...@gmail.com>
Subject Wrong HBase Sort Order with Pig
Date Fri, 13 Sep 2013 14:38:50 GMT
Hi, I already asked this on the Pig mailing list, but because I'm not sure
whether it is a Pig or an HBase issue, I will ask here too, since the Pig
function uses an HBase scan operation. Here is my question:

I have created an HBase table in the HBase shell and added some data.
http://hbase.apache.org/book/dm.sort.html says that the data is sorted
first by the row key and then by the column. So I tried something in the
HBase shell: http://pastebin.com/gLVAX0rJ
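
To make the ordering I expect concrete, here is a minimal Java sketch of "row key first, then column". It is only an illustration and not HBase code: the Cell class, the row key "row1" and the comparator are hypothetical, and it compares Java strings, whereas HBase compares raw bytes (for plain ASCII keys like these the result should be the same).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortOrderSketch {
    // Hypothetical stand-in for an HBase cell: only a row key and a column qualifier.
    static final class Cell {
        final String row;
        final String column;

        Cell(String row, String column) {
            this.row = row;
            this.column = column;
        }

        @Override
        public String toString() {
            return row + "/" + column;
        }
    }

    // Order by row key first, then by column qualifier (string comparison, an assumption).
    static final Comparator<Cell> ROW_THEN_COLUMN =
        Comparator.comparing((Cell c) -> c.row).thenComparing(c -> c.column);

    // Returns a sorted copy and leaves the input list untouched.
    static List<Cell> sorted(List<Cell> cells) {
        List<Cell> copy = new ArrayList<>(cells);
        copy.sort(ROW_THEN_COLUMN);
        return copy;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("row1", "d"), new Cell("row1", "a"), new Cell("row1", "c"));
        System.out.println(sorted(cells)); // prints [row1/a, row1/c, row1/d]
    }
}
```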

Everything looks fine. I got the right order, a -> c -> d, as expected.

Now I tried the same with Apache Pig in Java: http://pastebin.com/jdTpj4Fu

I got this result:


So now the order is c -> d -> a. That seems a little odd to me; shouldn't
it be the same as in HBase? It's important for me to get the right order
because I transform the map into a bag afterwards and then join it with
other tables. If both inputs are sorted, I could use a merge join without
having to sort the two datasets first. Does anyone know how to get a
sorted map (or bag) of the columns?
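
To show why the order matters for the join, here is a rough Java sketch of a merge join over two key lists. It is not Pig's implementation; the class and method names are hypothetical, and it assumes each list contains unique keys. It only works if both inputs are already sorted ascending.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeJoinSketch {
    // Returns the keys present in both lists.
    // Assumption: both inputs are sorted ascending and contain no duplicates.
    static List<String> mergeJoinKeys(List<String> left, List<String> right) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i).compareTo(right.get(j));
            if (cmp == 0) {
                // Same key on both sides: emit it and advance both cursors.
                out.add(left.get(i));
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // left key is smaller, so it cannot match anything later on the right
            } else {
                j++; // right key is smaller, advance the right cursor
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> other = Arrays.asList("a", "b", "d");
        // Sorted input: both common keys are found.
        System.out.println(mergeJoinKeys(Arrays.asList("a", "c", "d"), other)); // prints [a, d]
        // Input in the order c -> d -> a: the match on "a" is silently lost.
        System.out.println(mergeJoinKeys(Arrays.asList("c", "d", "a"), other)); // prints [d]
    }
}
```

With the columns in the order c -> d -> a, the single pass skips past "a" on the other side before it ever reaches it, which is why I need the sorted order.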

