hbase-user mailing list archives

From "Billy Pearson" <sa...@pearsonwholesale.com>
Subject Re: Some REST GET questions
Date Tue, 24 Mar 2009 02:41:19 GMT
Also note that you might look into using Thrift; I think it took over a lot 
of users from REST.
The support for keeping REST up to date and tested may not be there any more.

Also, the PHP class is from a long time ago (9/2008); there have been lots of 
changes in HBase since then.


"Chris Hostetter" <hossman_hbase@fucit.org> 
wrote in message news:Pine.LNX.4.64.0903231341200.22171@radix.cryptio.net...
> I've got myself a little HBase install up and running on a small Hadoop 
> cluster, currently running...
>  HBase Version 0.19.0, r735381
>  HBase Compiled Sun Jan 18 14:29:34 PST 2009, stack
>  Hadoop Version 0.19.0, r713890
>  Hadoop Compiled Fri Nov 14 03:12:29 UTC 2008, ndaley
> testing stuff out with the hbase shell, things are working nicely.  I'm 
> also trying out the REST API, and I have a few questions about
> how to execute certain queries.
> First off, this is the table i'm testing with...
> {NAME => 'userdata', IS_ROOT => 'false', IS_META => 'false',
>  FAMILIES => [{NAME => 'hist', BLOOMFILTER => 'false', COMPRESSION => 
> 'NONE', VERSIONS => '20', LENGTH => '2147483647', TTL => '-1', IN_MEMORY 
> => 'false', BLOCKCACHE => 'false'}, {NAME => 'user', BLOOMFILTER => 
> 'false', COMPRESSION => 'NONE', VERSIONS => '1', LENGTH => '2147483647', 
> TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}], INDEXES => []}
> This hypothetical example being a user activity tracking system -- the 
> "keys" will be usernames, and for every action a user takes, a row will be 
> inserted into the userdata table.  for some of the data i only care about 
> the last action the user took, and i put that in the "user" column family 
> (only 1 version) and for other pieces of data i want to keep a history of 
> the last 20 actions the user took (the "hist" column family)
> My first question is about clarifying what should/shouldn't be base64 
> encoded.  According to the wiki docs for the rest interface...
>   http://wiki.apache.org/hadoop/Hbase/HbaseRest
> ...the "value" portion of a column entry is base64 encoded, but the "name" 
> is not -- this matches the behavior i observe when POSTing data and then 
> inspecting it using the hbase shell -- however when I GET results from a 
> query using the REST interface, the names are coming back base64 encoded 
> as well.  This message from a year ago seems to suggest that this is the 
> expected behavior because names "can be arbitrary binary strings." ...
>   http://markmail.org/message/dyrnxphcjp3g4ow4
> ...but in that case there is an API discrepancy between the I and the O in 
> the I/O of the REST interface.  which is considered more correct? is there 
> a migration plan for rectifying the discrepancy?
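
On the base64 point, here is a minimal client-side sketch for coping with the asymmetry described above. It assumes that GET responses carry both the column name and the value base64 encoded. The `cells` list of (name, value) pairs is a hypothetical stand-in for an already-parsed response, not the actual REST payload format, and `decode_cells` / `encode_value` are illustrative helper names.

```python
import base64


def encode_value(value: bytes) -> str:
    """Base64-encode a raw cell value for sending (names are sent as plain text)."""
    return base64.b64encode(value).decode("ascii")


def decode_cells(cells):
    """Decode (name, value) pairs in which BOTH parts arrive base64 encoded.

    Names are decoded to text; values are left as bytes because they may
    hold arbitrary binary data.
    """
    decoded = {}
    for name_b64, value_b64 in cells:
        name = base64.b64decode(name_b64).decode("utf-8")
        decoded[name] = base64.b64decode(value_b64)
    return decoded


# Hypothetical sample shaped like the behaviour observed on GET.
sample = [(encode_value(b"hist:vote"), encode_value(b"23"))]
result = decode_cells(sample)
```

Keeping the decoding in one helper means that only this function needs to change if the interface is later made symmetric.
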
> Second Question: querying for multiple versions.  I'm trying to figure out 
> how i can execute the following query (from the hbase shell) via the REST 
> interface...
>    get 'userdata', 'hossman', {COLUMN => 'hist:vote', VERSIONS => 10}
> ...my naive assumption based on the other examples on the wiki is that 
> something like this might work...
>    http://host:60010/api/userdata/row/hossman?column=hist:vote&versions=10
> ...but the "versions" request param seems to be ignored.  Is this type of 
> multi-version query at all supported in the REST interface?
> My last question also relates to querying for multiple versions of 
> columns -- the key question being "column(s)" plural.  as i mentioned 
> before, this query in the hbase shell works fine for getting the last 10 
> versions of a specific column...
>     get 'userdata', 'hossman', {COLUMN => 'hist:vote', VERSIONS => 10}
> ...but i can't seem to find any way to indicate that i want the last 10 
> versions of *all* the columns associated with the specified key -- in 
> either the REST interface or the hbase shell. I was particularly surprised 
> by this error...
>    get 'userdata', 'hossman', { VERSIONS => 10 }
> TypeError: can't convert Hash into String
>  from /var/opt/chrish-hadoop/hbase-0.19.0/bin/../bin/hirb.rb:326:in `get'
>  from /var/opt/chrish-hadoop/hbase-0.19.0/bin/../bin/hirb.rb:326:in `get'
>  from (hbase):47:in `binding'
> Maybe IRB bug!!
> ...and the fact that this query only produced the most recent values for 
> the specified columns (even though querying for either of them 
> individually with the VERSIONS=>10 option produced the full list for 
> each)...
>    get 'userdata','hossman',{COLUMNS=>['hist:vote','hist:doc'],VERSIONS=>10}
> COLUMN                       CELL
>  hist:doc                    timestamp=1237842101205, value=2908
>  hist:vote                   timestamp=1237842101205, value=23
> 2 row(s) in 0.0360 seconds
> Obviously anything in the "user" family only has one version (because 
> that's the way the family was declared) but that's ok -- my goal is to get 
> whatever data is available going back up to 10 versions.  It's not so bad 
> if i have to execute two REST GETs: one for all of the current values in 
> the 'user' family, and one for the last 10 versions of all the values in 
> the 'hist' family; and it's not the end of the world if i have to 
> explicitly list all of the column names i want in each request -- but 
> making a separate request for every column name that has multiple versions 
> seems like it could get prohibitive.
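
To make the cost of the one-request-per-column fallback concrete, here is a rough sketch that builds the GET URLs for a list of columns. The host, port and path layout are copied from the example URL earlier in this message. `row_urls` is a hypothetical helper, and the `versions` parameter is included only to mirror that example, since it is reported above as being ignored on 0.19.0.

```python
from urllib.parse import quote, urlencode

# Host, port and path prefix taken from the example URL in this thread.
BASE = "http://host:60010/api"


def row_urls(table, row, columns, versions=None):
    """Return one row-GET URL per column (hypothetical helper)."""
    urls = []
    for column in columns:
        params = {"column": column}
        if versions is not None:
            # Reported in this thread as ignored by the REST interface.
            params["versions"] = versions
        # Keep ':' unescaped so the family:qualifier form stays readable.
        query = urlencode(params, safe=":")
        urls.append(f"{BASE}/{quote(table)}/row/{quote(row)}?{query}")
    return urls


urls = row_urls("userdata", "hossman", ["hist:vote", "hist:doc"], versions=10)
```

The number of requests grows linearly with the number of multi-version columns, which is the overhead being described.
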
> Thanks in advance for any light people might be able to shed on these 
> questions.
> -Hoss
