hbase-user mailing list archives

From Miguel Costa <miguel-co...@telecom.pt>
Subject RE: HBase design schema
Date Mon, 04 Apr 2011 21:57:41 GMT

Thanks for all your help.

I will try your solutions. I also saw this link

I will try OpenTSDB and maybe Zohmg.


-----Original Message-----
From: Peter Haidinyak [mailto:phaidinyak@local.com] 
Sent: Monday, April 4, 2011 19:24
To: user@hbase.apache.org
Subject: RE: HBase design schema

I've done almost the same thing at my work. Since I'm running on a VERY
small number of servers (2), I pre-aggregate my data into tables in the format

[YYYY-MM-DD]|[Keyword]|[Referrer]  for the row key

And then for the data column I store the hit count for that referrer. This
approach has a problem during inserts: with the date at the front of the key,
writes usually all go to one server. The upside is that during a client scan
you can set the start and end rows, such as
startRow = '2011-03-05|hospital| ' and endRow = '2011-03-05|hospital|~'.
This will return all of the referrers for the keyword hospital for the date
2011-03-05.
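
To illustrate why that start/end row pair works, here is a minimal sketch in
plain Python. It does not use the HBase client at all; it only imitates the
way row keys are kept in byte order and scanned from a start row (inclusive)
up to an end row (exclusive). The helper names row_key and scan, and the
sample hit counts, are made up for the example.

```python
from bisect import bisect_left

def row_key(date, keyword, referrer):
    # Hypothetical helper: builds the [YYYY-MM-DD]|[Keyword]|[Referrer] key.
    return f"{date}|{keyword}|{referrer}".encode("ascii")

def scan(table, start_row, stop_row):
    # Imitates a range scan over byte-ordered keys:
    # start_row is inclusive, stop_row is exclusive.
    keys = sorted(table)
    lo = bisect_left(keys, start_row)
    hi = bisect_left(keys, stop_row)
    return [(k, table[k]) for k in keys[lo:hi]]

# Made-up sample data: row key -> pre-aggregated hit count.
table = {
    row_key("2011-03-04", "hospital", "google.com"): 7,
    row_key("2011-03-05", "clinic", "bing.com"): 2,
    row_key("2011-03-05", "hospital", "bing.com"): 3,
    row_key("2011-03-05", "hospital", "google.com"): 12,
    row_key("2011-03-05", "hospitality", "google.com"): 5,
    row_key("2011-03-06", "hospital", "google.com"): 9,
}

# ' ' (0x20) and '~' (0x7E) bracket the printable ASCII referrer names.
rows = scan(table, b"2011-03-05|hospital| ", b"2011-03-05|hospital|~")

# Top referrer for the keyword on that date, by hit count.
top_key, top_hits = max(rows, key=lambda kv: kv[1])
top_referrer = top_key.decode("ascii").split("|")[2]
```

Because the '|' separator is part of both boundaries, the keyword
'hospitality' is not picked up by the scan for 'hospital'. The trick assumes
referrer values contain only printable ASCII characters that sort between
' ' and '~'.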



From: Miguel Costa [mailto:miguel-costa@telecom.pt]
Sent: Monday, April 04, 2011 9:12 AM
To: user@hbase.apache.org
Subject: HBase design schema


I need some help with a schema design on HBase.

I have 5 dimensions (Time, Site, Referrer, Keyword, Country).
My row key is Site+Time.

Now I want to answer some questions like: what is the top referrer by keyword
for a site over a period of time?
Basically I want to cross all the dimensions that I have. And if I have 30

What is the best schema design?

Please let me know if this isn't the right mailing list.

Thank you for your time.

