hbase-user mailing list archives

From Miguel Costa <miguel-co...@telecom.pt>
Subject RE: HBase design schema
Date Mon, 04 Apr 2011 21:57:41 GMT

Thanks for all your help.

I will try your solutions. I also saw this link
http://static.last.fm/johan/huguk-20090414/fredrik-hypercubes-in-hbase.pdf.

I will try OpenTSDB and maybe Zohmg.


  
Miguel

-----Original Message-----
From: Peter Haidinyak [mailto:phaidinyak@local.com] 
Sent: Monday, April 04, 2011 19:24
To: user@hbase.apache.org
Subject: RE: HBase design schema

I've done almost the same thing at my work. Since I'm running on a VERY
small number of servers (2), I pre-aggregate my data into tables in the
format...

[YYYY-MM-DD]|[Keyword]|[Referrer]  for the row key

Then, in the data column, I store the hit count for that referrer. This
approach has a problem during inserts: with the date at the front of the
key, all writes for a given day usually go to one server. The upside is that
during a client scan you can set the start and end rows, such as startRow =
'2011-03-05|hospital| ' and endRow = '2011-03-05|hospital|~'. This returns
all of the referrers for the keyword hospital for the date 2011-03-05.
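
To make the key layout and the scan bounds concrete, here is a minimal sketch in plain Python. It does not use the HBase client: a sorted list stands in for a table, since HBase keeps rows in lexicographic key order, and the sample hits and the helper names (row_key, pre_aggregate, scan) are made up for illustration.

```python
from bisect import bisect_left
from collections import Counter

SEP = "|"

def row_key(date, keyword, referrer):
    """Build a key in the [YYYY-MM-DD]|[Keyword]|[Referrer] layout."""
    return SEP.join((date, keyword, referrer))

def pre_aggregate(hits):
    """Count hits per key and return (key, count) pairs sorted by key.

    The sorted list is a stand-in for a table whose rows are kept in
    lexicographic key order (ASCII keys assumed).
    """
    counts = Counter(row_key(d, k, r) for d, k, r in hits)
    return sorted(counts.items())

def scan(rows, start_row, end_row):
    """Return rows with start_row <= key < end_row (end row exclusive)."""
    keys = [key for key, _ in rows]
    lo = bisect_left(keys, start_row)
    hi = bisect_left(keys, end_row)
    return rows[lo:hi]

# Hypothetical sample data: (date, keyword, referrer), one tuple per hit.
hits = [
    ("2011-03-05", "hospital", "google.com"),
    ("2011-03-05", "hospital", "google.com"),
    ("2011-03-05", "hospital", "bing.com"),
    ("2011-03-05", "hotel", "google.com"),
    ("2011-03-06", "hospital", "google.com"),
]

rows = pre_aggregate(hits)

# ' ' sorts before and '~' sorts after the characters used in these
# referrers, so the two bounds bracket every referrer for this date+keyword.
result = scan(rows, "2011-03-05|hospital| ", "2011-03-05|hospital|~")

# Top referrer for the keyword on that date, by pre-aggregated hit count.
top = max(result, key=lambda row: row[1])
```

The bounds only work if no referrer starts with a character that sorts below ' ' or above '~', which is an assumption about the data rather than something the key format guarantees.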

YMMV

-Pete

From: Miguel Costa [mailto:miguel-costa@telecom.pt]
Sent: Monday, April 04, 2011 9:12 AM
To: user@hbase.apache.org
Subject: HBase design schema

Hi,

I need some help with a schema design in HBase.

I have 5 dimensions (Time, Site, Referrer, Keyword, Country).
My row key is Site+Time.

Now I want to answer questions like: what is the top Referrer by Keyword
for a site over a period of time?
Basically, I want to cross all the dimensions that I have. And what if I
have 30 dimensions?

What is the best schema design?

Please let me know if this isn't the right mailing list.

Thank you for your time.

Miguel


