hbase-user mailing list archives

From Wilm Schumacher <wilm.schumac...@cawoom.com>
Subject Re: Thrift getRows API with column filtering?
Date Sat, 22 Nov 2014 08:55:40 GMT

1.) Your plan to minimize the number of column families is very wise.

2.) Which API are you using, thrift or thrift2? At first glance the
original (or old) thrift API does not seem to support your plan.

However, it is supported. See e.g. http://hbase.apache.org/book/thrift.html
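
For illustration, the filter language described on that page lets you express a column-prefix restriction as a plain string. The sketch below only builds such a string. The helper names `column_prefix_filter` and `any_column_prefix` are made up for this example. Two things are assumptions you should check against your HBase version: that your Thrift API accepts a filter string (for example on the scan object), and that single quotes inside an argument are escaped by doubling them.

```python
from typing import Sequence


def column_prefix_filter(prefix: str) -> str:
    """Build a filter-language expression for columns whose qualifier
    starts with `prefix`.

    Assumption: a single quote inside the argument is escaped by doubling it.
    """
    escaped = prefix.replace("'", "''")
    return "ColumnPrefixFilter('{}')".format(escaped)


def any_column_prefix(prefixes: Sequence[str]) -> str:
    """Combine several prefix filters with OR (hypothetical helper)."""
    return " OR ".join(column_prefix_filter(p) for p in prefixes)


# The resulting strings would be handed to the Thrift scan call.
print(column_prefix_filter("evt_"))
print(any_column_prefix(["evt_", "tag_"]))
```

As far as I can tell, getRows() itself takes no filter, so this would probably have to go through a scanner restricted to the rows you want.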

The thrift2 API seems to be the way to go for you (I have never actually
used it), but it is very poorly documented. So IF you want to do it with
hbase-thrift, you are able to do that, but it will not make you happy ;).

3.) I would not recommend your plan. I started just like you, with
exactly your plan of using the native thrift support of hbase from node.
And for a long time it worked just fine. But I realized that it is MUCH
easier to create a thrift server of my own (built exactly for the purpose
and problem), do the hbase stuff with the native java client and serve the
results from my custom thrift server to the node application.

The two advantages of this plan are
a) no problems like the one you are describing. You have the full power of
the hbase client at hand
b) It also gives you an extra plus on security. You can separate
the node server completely from the db server. If an evil guy takes over
your server, with your plan he would be able to make a full scan of your
db. With the custom thrift server this wouldn't be possible that easily,
because he would only be able to use your (minimal) api. See the example
in the ps.
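
To make the "minimal api" idea concrete, here is a small sketch, written in Python only for brevity. A plain dict stands in for the hbase table, and `PrefixService` and `get_columns_with_prefix` are hypothetical names. In the setup described above, the method body would call the native java client and the class would be exposed through your own thrift service.

```python
from typing import Dict

# Stand-in for an hbase table: row key -> {column qualifier -> value}.
FAKE_TABLE: Dict[str, Dict[str, bytes]] = {
    "row1": {"evt_001": b"a", "evt_002": b"b", "tag_x": b"c"},
}


class PrefixService:
    """The only operations the node application is allowed to call."""

    def __init__(self, table: Dict[str, Dict[str, bytes]]) -> None:
        self._table = table

    def get_columns_with_prefix(self, row_key: str, prefix: str) -> Dict[str, bytes]:
        """Return only the columns of one row whose qualifier starts with prefix."""
        row = self._table.get(row_key, {})
        return {q: v for q, v in row.items() if q.startswith(prefix)}


service = PrefixService(FAKE_TABLE)
print(sorted(service.get_columns_with_prefix("row1", "evt_")))
```

The filtering happens next to the database, so only the matching columns travel to node, and there is no method that scans a whole table.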

Hope it helps



PS: an example of the security gain from doing it the way proposed in point 3:

* Imagine a passwd check.

a) using the hbase thrift server you would fetch a user from the db, get
the salted hash, and compare it with the hash of the POST'ed passwd.
Works quite well

BUT: if an evil guy takes over your node server he would be able to make
a scan on the user table and would get EVERY e-mail, name, perhaps
birthdate AND the hash

b) using a custom thrift server you would create a function in it

CustomUserStruct login( 1:string username , 2:string hash ) throws
(1:InvalidCredentials invalid , 2:SomeError error)

or so. So if the evil guy took over your node server, he would still
have to brute-force the passwd, and he would still have to guess the
username. And he would have to do it on your server, so he would only
have as much time as you need to detect the intrusion and kick him out.
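
A rough sketch of what such a login function could do on the server side, in Python for brevity. Everything here is illustrative: the in-memory `_USERS` dict stands in for the user table, `add_user` and `hash_password` are made-up helpers, and PBKDF2 with these parameters is just one possible choice. Unlike the signature above, this variant takes the posted password and hashes it on the server.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Dict, Tuple


class InvalidCredentials(Exception):
    """Raised for an unknown user or a wrong password alike."""


@dataclass
class CustomUserStruct:
    # Only the fields the node application really needs, never the hash.
    username: str
    display_name: str


# Stand-in for the user table: username -> (salt, password hash, display name).
_USERS: Dict[str, Tuple[bytes, bytes, str]] = {}


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def add_user(username: str, password: str, display_name: str) -> None:
    salt = os.urandom(16)
    _USERS[username] = (salt, hash_password(password, salt), display_name)


def login(username: str, password: str) -> CustomUserStruct:
    record = _USERS.get(username)
    if record is None:
        raise InvalidCredentials()
    salt, stored_hash, display_name = record
    # Constant-time comparison of the stored and the freshly computed hash.
    if not hmac.compare_digest(stored_hash, hash_password(password, salt)):
        raise InvalidCredentials()
    return CustomUserStruct(username=username, display_name=display_name)
```

The salt and hash never leave the server, and the caller learns nothing except whether one particular username and password pair was accepted.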

On 21.11.2014 at 18:17, JM Tremblay wrote:
> Hi,
> I'm using the node.js HBase Thrift client.  I can use getRows() to fetch
> specific rows with all their columns or getRowsWithColumns() to specify the
> columns or column families to return.  But I can't figure out how to
> specify columns starting with a given prefix, as it seems to be possible
> with the Java API.
> These are columns created dynamically and the client doesn't know their
> name in advance (apart from the prefix).  I'm trying to avoid fetching all
> the columns and filter on the client side to minimize the size of data
> transferred. I'm also trying to avoid having a lot of column families (one
> for each prefix would be too much).
> What are my options?
> JM
