hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Data size
Date Thu, 01 Apr 2010 01:18:45 GMT
HBase is column-oriented; every cell is stored along with its row key,
family, qualifier and timestamp, so every piece of data takes up more
disk space than the raw value alone. Without knowing anything about
your keys, I can't comment much more.
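
To make the per-cell overhead concrete, here is a rough sketch in Python
of how many bytes one cell occupies before compression. It assumes the
KeyValue layout as I understand it (two 4-byte length prefixes, a 2-byte
row length, a 1-byte family length, an 8-byte timestamp and a 1-byte
type); the row key, family, qualifier and value below are made up, so
plug in your own.

```python
# Rough per-cell size estimate; ignores HFile block indexes and compression.
# Assumed fixed fields: key length, value length, row length, family length,
# timestamp, key type.
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def cell_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Approximate number of bytes stored for a single cell."""
    return FIXED_OVERHEAD + len(row) + len(family) + len(qualifier) + len(value)


# Hypothetical cell: 16-byte row key, family "cf", qualifier "name", 5-byte value.
value = b"ALICE"
size = cell_size(b"0123456789abcdef", b"cf", b"name", value)
blowup = size / len(value)
print(size, round(blowup, 1))
```

With short values and long keys, the coordinates stored with each cell can
easily outweigh the value itself, and this cost is paid again for every
column of every row.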

Then HDFS keeps a trash, so every file that gets compacted will end up
there. If you just did the import, there will be a lot of these.

Finally, if you imported the data more than once, HBase keeps 3
versions by default.

So in short, is it reasonable? Answer: it depends!
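
As a back-of-the-envelope check on the numbers in the question (35G in,
562G used), the sketch below computes the overall growth and how much of
it is left to explain by per-cell overhead and trashed files if the data
was imported one, two or three times. Splitting the growth into
multiplicative factors like this is a simplification of mine, not a
measurement, and `residual_factor` is just an illustrative helper.

```python
raw_gb = 35.0    # size of the dumped files, from the original question
hdfs_gb = 562.0  # HDFS usage reported after the import


def residual_factor(raw: float, used: float, versions: int) -> float:
    """Growth still unexplained once `versions` stored copies per cell are assumed."""
    return used / (raw * versions)


overall = hdfs_gb / raw_gb
print("overall growth:", round(overall, 1))
for versions in (1, 2, 3):
    # Assumption: each extra import adds one more stored version of every cell.
    print(versions, round(residual_factor(raw_gb, hdfs_gb, versions), 1))
```

If the residual factor looks too large for your key and value sizes, the
trash is the next place to look.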

J-D

2010/3/31  <y_823910@tsmc.com>:
> Hi,
>
> We've dumped Oracle data to files, then put these files into different
> HBase tables.
> The size of these files is 35G; we saw HDFS usage go up to 562G after
> putting them into HBase.
> Is that reasonable?
> Thanks
>
>
>
> Fleming Chiu(邱宏明)
> 707-6128
> y_823910@tsmc.com
> 週一無肉日吃素救地球(Meat Free Monday Taiwan)
