hbase-user mailing list archives

From Matt Corgan <mcor...@hotpads.com>
Subject Re: Data size
Date Thu, 01 Apr 2010 22:00:40 GMT
Jonathan - thanks for the detailed answer.  I'm sure implementing this
stuff is a nightmare when trying to minimize object instantiations.  But, since
you mentioned it had been discussed before, here's a concrete example to
throw some support behind non-duplication and prefix compression in future
releases.

At hotpads.com, we host about 4.5 million real estate listings and archive
statistics about them so that the homeowners can see how many people have
viewed their home.  Each listing has a compound string key like
"EquityResidential/VA45588438", and there are 26 integer statistics for each
listing that we aggregate on a daily basis.  The statistics are counts of
things like displayed, previewed, viewed, mobileListingViewed,
contactInfoViewed, emailed, etc.  They're in a table called
DailyListingSummary.
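
For concreteness, here's roughly how we write one of those counters today
with the client API (table, family, and qualifier names as described above;
the listing id is made up, so treat this as a sketch):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BumpCounter {
        public static void main(String[] args) throws Exception {
            HTable table = new HTable(new HBaseConfiguration(), "DailyListingSummary");

            // row key = date + compound listing key
            byte[] row = Bytes.toBytes("20100331/EquityResidential/VA45588438");
            byte[] family = Bytes.toBytes("ManagerEventCounts");

            // atomically bump one of the 26 daily counters -- the increment
            // functionality I mention in option (e) below
            table.incrementColumnValue(row, family, Bytes.toBytes("viewed"), 1L);
        }
    }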

- PK is something like "20100331/EquityResidential/VA45588438"
- average column name length is 17 bytes
- we generate 100,000,000 rows per month
- on MySQL, a typical row is a 39-byte key + 26 4-byte integers = ~143 bytes
- 1 month of data is about 15 GB

We've been storing them in monthly partitioned MySQL tables, but schema
changes are nearly impossible, and write speed is obviously not great.
 We're considering moving several things to HBase, but the size inflation on
this style of data is brutal.  I could compress the tables for some big disk
savings, but my main concern is how much data fits in memory to serve user
queries quickly.  I don't mind wasting disk, but I'm pretty aggressive about
keeping stuff in memory.  I assume all the data is expanded in memory... is
that correct?

On HBase with a ColumnFamily name of "ManagerEventCounts", each cell would be
~100 bytes, and each row would be ~2553 bytes.  That's 18x inflation, which
turns my 15 GB monthly table into 255 GB of raw data, and that's before
replication.
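
To show where those numbers come from, here's the back-of-envelope math
against the KeyValue on-disk layout as I understand it (key length + value
length + row length + row + family length + family + qualifier + timestamp +
type + value), using the averages above:

    public class CellSizeMath {
        public static void main(String[] args) {
            int row = 39;                               // date + compound listing key
            int family = "ManagerEventCounts".length(); // 18
            int qualifier = 17;                         // average column name
            int value = 4;                              // one int counter

            // fixed overhead: keylen(4) + vallen(4) + rowlen(2)
            //                 + famlen(1) + timestamp(8) + type(1)
            int perCell = 4 + 4 + 2 + row + 1 + family + qualifier + 8 + 1 + value;
            int perRow = 26 * perCell;
            System.out.println(perCell + " bytes/cell, " + perRow
                + " bytes/row, ~" + Math.round(perRow / 143.0) + "x MySQL");
        }
    }

That prints 98 bytes/cell and 2548 bytes/row, i.e. roughly the 18x above.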

a) If each cell didn't have to store the ColumnFamily name, we're down to
~2085 bytes/row (208 GB/month, or 15x)
b) Stop duplicating the row key in every cell and we're at ~1110 bytes (111
GB/month, or 8x)
c) My application could map the column names to 1 byte each and we're at
~679 bytes (67 GB/month, or 5x)
d) Prefix compression would be a huge improvement on my bulky primary keys
(see the toy sketch after this list)
e) To shrink it further, I'd have to serialize each row into a single cell
(handling schema changes in my app, as well as forfeiting HBase's increment
functionality)
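
To illustrate (d): since the keys sort together, consecutive keys share
almost their whole prefix, so even simple front-coding (store the shared
prefix length plus the new suffix) would collapse them.  A toy sketch of the
idea, not actual HBase code:

    public class PrefixCompressDemo {
        public static void main(String[] args) {
            String[] keys = {
                "20100331/EquityResidential/VA45588438",
                "20100331/EquityResidential/VA45588439",
                "20100331/EquityResidential/VA45588501",
            };
            String prev = "";
            for (String key : keys) {
                int n = Math.min(prev.length(), key.length()), common = 0;
                while (common < n && prev.charAt(common) == key.charAt(common)) {
                    common++;
                }
                // store only (common, suffix) instead of the full key
                System.out.println(common + " + \"" + key.substring(common) + "\"");
                prev = key;
            }
        }
    }

After the first key, each subsequent key stores only a few bytes of suffix
(36 + "9", then 34 + "501") instead of the full 37.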

(c) and (e) need to be handled by my application, and (d) is probably
difficult to implement, but (a) and (b) seem like they could have good bang
for the buck.  In my case they shrink the data by about 2.3x.  For workloads
that judge the speed of
HBase by random lookups on bigger-than-memory data, the cache effectiveness
would be greatly improved.



2010/3/31 Jonathan Gray <jgray@facebook.com>

> There are many implications related to this.  The core trade-off as I see
> it is between storage and read performance.
>
> With the current setup, after we read blocks from HDFS into memory, we can
> just usher KeyValues straight out of the on-disk format and to the client
> without any further allocation or copies.  This is a highly desirable
> property.
>
> If we were to only keep what was absolutely necessary (could not be
> inferred or explicitly tracked in some way), then we would have to do a lot
> of work at read time to regenerate client-friendly data.
>
> I'm not sure exactly what you mean by storing the row length at the
> beginning of each row.  Families are certainly the easiest of these
> optimizations to make but change read behavior significantly.  It has been
> talked about and there's probably a jira hanging around somewhere.
>
> In the end, the HDFS/HBase philosophy is that disk/storage is cheap so we
> should do what we can (within reason) for read performance.
>
> Much of this is mitigated by the use of compression.  Currently we only
> utilize block compression (gzip default, lzo preferred).  BigTable uses a
> special prefix-compression which is ideal for this duplication issue; maybe
> one day we could do that too.
>
> JG
>
> > -----Original Message-----
> > From: Matt Corgan [mailto:mcorgan@hotpads.com]
> > Sent: Wednesday, March 31, 2010 7:06 PM
> > To: hbase-user@hadoop.apache.org
> > Cc: alex@cloudera.com; jlhuang@cs.nctu.edu.tw; kevin_hung@tsmc.com
> > Subject: Re: Data size
> >
> > Out of curiosity, why is it necessary to store the family and row with
> > every cell?  Aren't all the contents of a family confined to the same
> > file, and couldn't a row length be stored at the beginning of each row
> > or in a block index?  Is this true for values in the caches and
> > memstore as well?
> >
> > It could have drastic implications for storing rows with many small
> > values but with long keys, long column names, and innocently verbose
> > column family names.
> >
> > Matt
> >
> > 2010/3/31 alex kamil <alex.kamil@gmail.com>
> >
> > > I would also suggest checking the dfs.replication setting in HDFS (in
> > > conf/hdfs-site.xml)
> > >
> > > A-K
> > >
> > > 2010/3/31 Jean-Daniel Cryans <jdcryans@apache.org>
> > >
> > > > HBase is column-oriented; every cell is stored with the row,
> > > > family, qualifier and timestamp, so every piece of data brings
> > > > larger disk usage.  Without any knowledge of your keys, I can't
> > > > comment much more.
> > > >
> > > > Then HDFS keeps a trash, so every compacted file will end up
> > > > there... if you just did the import, there will be a lot of these.
> > > >
> > > > Finally, if you imported the data more than once, HBase keeps 3
> > > > versions by default.
> > > >
> > > > So in short, is it reasonable? Answer: it depends!
> > > >
> > > > J-D
> > > >
> > > > 2010/3/31  <y_823910@tsmc.com>:
> > > > > Hi,
> > > > >
> > > > > We've dumped Oracle data to files, then put these files into
> > > > > different HBase tables.
> > > > > The size of these files is 35G; we saw HDFS usage go up to 562G
> > > > > after putting it into HBase.
> > > > > Is that reasonable?
> > > > > Thanks
> > > > >
> > > > >
> > > > >
> > > > > Fleming Chiu(邱宏明)
> > > > > 707-6128
> > > > > y_823910@tsmc.com
> > > > > Go meat-free on Mondays to save the Earth (Meat Free Monday Taiwan)
>
