hbase-user mailing list archives

From Marc Sturm <mas9...@nyp.org>
Subject question about composite rowKey and performance difference between getScanner() and get(Get[])
Date Wed, 03 Dec 2014 20:31:20 GMT

I have a many-to-many relationship that I am trying to model in HBase, and I want to be sure
I am not missing anything, so please let me know or point me to the right documentation.

Let's say A and B are in a many-to-many relationship. The query takes A's unique id as its
parameter and returns the unique ids of all the Bs related to that A, with their properties and values.

The first solution I found uses two tables. In the first table the rowkey is A's unique id
and the column identifiers are the unique ids of the Bs related to that A. In the second table
the rowkeys are B unique ids and the columns contain the property values. So the query
takes two steps: it first does a get on the A table to collect all the B unique ids, and then does
a second get on the B table, passing an array of B rowkeys as the parameter. When I run the second
step, the first call can have a much higher latency, while subsequent calls with the same
parameters have good, low latency. I believe that's a caching issue...
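
To make the two-step lookup concrete, here is a minimal sketch that models the two tables as
in-memory sorted maps. It is not HBase client code: the class name, the sample ids and the
property values are all made up for illustration, and a real client would issue one get for
step 1 and one batched get for step 2, where the B rowkeys may be spread over several regions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of the two-table design (not the HBase API).
final class TwoTableSketch {
    // Table 1: rowkey = A id, "column identifiers" = ids of the related Bs.
    static final Map<String, Set<String>> A_TO_B = new TreeMap<>();
    // Table 2: rowkey = B id, value = B's property value.
    static final Map<String, String> B_PROPS = new TreeMap<>();

    static {
        // Made-up sample data.
        A_TO_B.put("a1", new TreeSet<>(List.of("b1", "b2")));
        A_TO_B.put("a2", new TreeSet<>(List.of("b2", "b3")));
        B_PROPS.put("b1", "red");
        B_PROPS.put("b2", "blue");
        B_PROPS.put("b3", "green");
    }

    // Step 1: one lookup on the A table. Step 2: one lookup per B rowkey.
    static List<String> lookup(String aId) {
        Set<String> bIds = A_TO_B.getOrDefault(aId, Set.of());
        List<String> out = new ArrayList<>();
        for (String bId : bIds) {
            String props = B_PROPS.get(bId);
            if (props != null) {
                out.add(bId + "=" + props);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(lookup("a1")); // [b1=red, b2=blue]
    }
}
```

Note that step 2 touches one independent rowkey per related B, so the rows it reads need not
sit next to each other in the table.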

The second solution is a single table with a composite rowkey equal to A's unique id + B's unique id,
so the same B unique id will appear in multiple rows. But when I do a scan on just the first part
of the rowkey (A's unique id), the response time is lower and more consistent.
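
A scan on the first part of a composite rowkey can be sketched as a range read over sorted keys.
The sketch below uses a TreeMap in place of the table; the '|' separator, the class name and
the sample rows are assumptions for illustration, not something HBase requires. The separator
is there so that a scan for "a1" does not also pick up rows that belong to "a10".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of the composite-rowkey design (not the HBase API).
final class CompositeKeySketch {
    // Assumed separator between the A id and the B id.
    static final char SEP = '|';

    static String rowKey(String aId, String bId) {
        return aId + SEP + bId;
    }

    // Made-up sample rows: B's property value is repeated for every A it relates to.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put(rowKey("a1", "b1"), "red");
        table.put(rowKey("a1", "b2"), "blue");
        table.put(rowKey("a10", "b1"), "red");
        table.put(rowKey("a2", "b3"), "green");
        return table;
    }

    // Range read: start at "<aId>|" (inclusive), stop at "<aId>}" (exclusive),
    // where '}' is the character right after '|'.
    static List<String> scanByA(NavigableMap<String, String> table, String aId) {
        String start = aId + SEP;
        String stop = aId + (char) (SEP + 1);
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> row : table.subMap(start, true, stop, false).entrySet()) {
            String bId = row.getKey().substring(start.length());
            out.add(bId + "=" + row.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(scanByA(sampleTable(), "a1")); // [b1=red, b2=blue]
    }
}
```

Because the rows are sorted by rowkey, all rows for one A are adjacent and can be read as one
contiguous range.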

So, my questions are threefold: 1) which approach is better, 2) what is the performance difference
between a scan and a get with multiple rowkeys (I think the scan is faster because the data is
not, or is less, "distributed"), and 3) how can we make the latency of the get with multiple
rowkeys more consistent?

Thank you for your help,
