hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: HBase schema option consideration
Date Thu, 15 Apr 2010 17:40:03 GMT


Your formatting seems to have gotten messed up.

Why do you want to separate this out into multiple column families instead of using a single
column family in one table?
Table: BLOGO
Row ID: blog_id
Column Family: blogs
Columns in blogs:
blog_posted_ts,
blog_author,
blog_subject,
blog_content

Then you get everything in one fetch.
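
A minimal sketch of that layout, using nested Python dicts as a stand-in for the table (row key -> column family -> column -> value). This is not the HBase client API; the `put` and `get_row` helpers and the sample values are hypothetical, for illustration only:

```python
# Stand-in for an HBase table: row key -> column family -> column -> value.
# Plain dicts only; nothing here talks to a real HBase cluster.
table = {}


def put(row_key, family, columns):
    """Store several columns under one column family of a row."""
    table.setdefault(row_key, {}).setdefault(family, {}).update(columns)


def get_row(row_key, family):
    """Return a copy of every column in one family of a row, in one lookup."""
    return dict(table.get(row_key, {}).get(family, {}))


# Sample values are made up for the example.
put("blog-0001", "blogs", {
    "blog_posted_ts": "2010-04-15T17:40:03Z",
    "blog_author": "example author",
    "blog_subject": "example subject",
    "blog_content": "example content",
})

# One fetch by row key returns all four columns together.
post = get_row("blog-0001", "blogs")
print(sorted(post))
```

Because all four columns live in the same column family of the same row, a single lookup by blog_id returns the whole post, and each field can still be read or updated individually.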


Or am I missing something?

-Mike

> Date: Thu, 15 Apr 2010 12:32:48 -0500
> Subject: HBase schema option consideration
> From: nkapshoo@gmail.com
> To: hbase-user@hadoop.apache.org
> 
> This is an HBase schema design question. Suppose I store blog entry details
> in an HBase table:
> blogid, blog_content, blog_author, blog_subject.
> 
> My queries always retrieve all of this data at the same time.
> 
> So is it a better idea to store all of this in a single JSON/protobuf object, or
> to actually separate out the details into column families?
> 
> Option1:
> 
> Table   RowKey   Column Family   Value
> Blogs   BlogId   Details         JSON(Content, Author, Subject)
> 
> Option2:
> 
> Table   RowKey   Column Family
> Blogs   BlogId   Content
>                  Author
>                  Subject
> 
> 
> I was thinking of option1 because it seems it might be faster since all
> details will be physically stored together. But option2 is what seems to be
> the trend when I look at other basic HBase schema examples out there.
> 
> Please let me know your opinions and whether I am on the right track.
> 
> Thanks in advance.
 		 	   		  