hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: HBase schema option consideration
Date Thu, 15 Apr 2010 17:40:03 GMT

Your formatting seems to have gotten messed up.

Why do you want to separate this out into multiple column families instead of a single column
family in a table?
Table: BLOGO
Row ID: blog_id
Column Family: blogs
Columns in blogs:

Then you get everything in one fetch.

Or am I missing something?
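
To make the comparison concrete, here is a rough sketch in Python. It is a toy in-memory model of the cell layout, not the HBase client API: the `put` and `get` helpers, the row keys, and the qualifier names (`content`, `author`, `subject`, `json`) are all made up for illustration. It shows one column family holding one column per field next to Option 1 from the question, where a single column holds a JSON blob.

```python
import json

# Toy model of an HBase table: row key -> {"family:qualifier": bytes}.
# This is NOT the HBase client API; it only mimics how cells are addressed.
table = {}

def put(row_key, family, qualifier, value):
    """Store one cell under family:qualifier for the given row (hypothetical helper)."""
    table.setdefault(row_key, {})[f"{family}:{qualifier}"] = value.encode("utf-8")

def get(row_key, family):
    """Return every column of one family for a row in a single lookup (hypothetical helper)."""
    prefix = family + ":"
    return {
        col[len(prefix):]: val.decode("utf-8")
        for col, val in table.get(row_key, {}).items()
        if col.startswith(prefix)
    }

# Single column family "blogs" with one column per field.
put("blog-1", "blogs", "content", "Hello HBase")
put("blog-1", "blogs", "author", "example-author")
put("blog-1", "blogs", "subject", "Schema design")
row = get("blog-1", "blogs")  # one fetch returns all three fields

# Option 1 from the question: one column holding a JSON document.
details = {"content": "Hello HBase", "author": "example-author", "subject": "Schema design"}
put("blog-2", "details", "json", json.dumps(details))
blob = json.loads(get("blog-2", "details")["json"])  # one fetch, then decode

print(row)
print(blob)
```

Either way the reader gets all the fields from one row lookup. The difference in this sketch is that the per-column layout lets you read or update a single field without decoding and rewriting the whole blob.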


> Date: Thu, 15 Apr 2010 12:32:48 -0500
> Subject: HBase schema option consideration
> From: nkapshoo@gmail.com
> To: hbase-user@hadoop.apache.org
> This is an HBase schema design question. Suppose I store blog entry details
> in an HBase table:
> blogid, blog_content, blog_author, blog_subject.
> My query is such that it always retrieves all this data at the same time.
> So is it a better idea to store all this in a single json/protobuf object or
> actually separate out the details into column families?
> Option1:
> Table    RowKey    Column Family    Value
> Blogs    BlogId    Details          JSON(Content, Author, Subject)
> Option2:
> Table    RowKey    Column Family
> Blogs    BlogId    Content
>                    Author
>                    Subject
> I was thinking of option1 because it seems it might be faster since all
> details will be physically stored together. But option2 is what seems to be
> the trend when I look at other basic HBase schema examples out there.
> Please let me know opinions and if I am on the right track...
> Thanks in advance.