hbase-user mailing list archives

From schubert zhang <zson...@gmail.com>
Subject Re: Many columns in 0.19
Date Wed, 11 Mar 2009 03:06:09 GMT
Cool, the HFile solution is what is described in the Bigtable paper, and it
will be more efficient than MapFile. We are looking forward to 0.20.0,
including Bloom filters.
Thanks.
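
To make the block sizes in the message quoted below concrete, here is a rough
Python sketch of the arithmetic. It only assumes the figures given in the
thread (~64k HFile blocks, 64M HDFS blocks, both configurable); the constant
and function names are made up for illustration and are not HBase or HDFS
APIs.

```python
# Sizes taken from the thread; both are described there as configurable.
HFILE_BLOCK_BYTES = 64 * 1024         # ~64k HFile block
HDFS_BLOCK_BYTES = 64 * 1024 * 1024   # 64M HDFS block


def hfile_blocks_per_hdfs_block(hfile_block=HFILE_BLOCK_BYTES,
                                hdfs_block=HDFS_BLOCK_BYTES):
    """How many whole HFile blocks fit into one HDFS block."""
    return hdfs_block // hfile_block


def hdfs_blocks_for_file(file_bytes, hdfs_block=HDFS_BLOCK_BYTES):
    """How many HDFS blocks a file of the given size occupies (rounded up)."""
    return -(-file_bytes // hdfs_block)  # ceiling division on integers


if __name__ == "__main__":
    print(hfile_blocks_per_hdfs_block())            # HFile blocks per HDFS block
    print(hdfs_blocks_for_file(200 * 1024 * 1024))  # HDFS blocks for a 200M file
```

So a small store file sits in a single HDFS block, while a large one spans
several, each holding many of the smaller HFile blocks.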

On Wed, Mar 11, 2009 at 2:28 AM, Jonathan Gray <jlist@streamy.com> wrote:

> Aseem,
>
> Almost!
>
> You will have 10 HStores as you say.  Each of those HStores is made up of a
> single Memcache instance and zero or many MapFiles on HDFS.  Default block
> size in HDFS is 64MB not 64k, so it could be a single block or many.
>
> Writes are done into the Memcache.  That is periodically flushed to HDFS
> creating a single HStoreFile.  Multiple flushes will then yield multiple
> HSFs.  Compactions and major compactions are run periodically to combine
> these files into a single HStoreFile, for efficiency.
>
> In the upcoming 0.20 release we will move to a new HDFS file format called
> HFile.  Within HFile, our data will be broken up into ~64k blocks
> (configurable) but still stored in HDFS in 64M blocks (again,
> configurable).
>
> JG
>
> > -----Original Message-----
> > From: Puri, Aseem [mailto:Aseem.Puri@Honeywell.com]
> > Sent: Monday, March 09, 2009 9:34 PM
> > To: hbase-user@hadoop.apache.org
> > Subject: RE: Many columns in 0.19
> >
> > Hi
> >
> > Thanks for the help.
> >
> > So it means that if a table has 10 column families, then there are
> > 10 HStores in a region and, correspondingly, 10 map files. Each
> > MapFile is further divided into 64K blocks that are stored by HDFS.
> >
> > Am I right?
> >
> > -Aseem Puri
> >
> >
> >
> > -----Original Message-----
> > From: Jonathan Gray [mailto:jlist@streamy.com]
> > Sent: Monday, March 09, 2009 7:24 PM
> > To: hbase-user@hadoop.apache.org
> > Subject: RE: Many columns in 0.19
> >
> > A Table is made up of 1 to N HRegions and defined by its Column
> > Families.
> >
> > Each HRegion is made up of an HStore per column family.  Each HStore is
> > then made up of a single Memcache and 0 to M HStoreFiles.
> >
> > So, the HStore is one column family in one region.  It houses that
> > family's Memcache and HStoreFiles for that particular region.
> >
> > And yes, Bigtable stores one family of a region in one SSTable.  The
> > only caveat to that is that they offer "Locality Groups", as mentioned
> > by Ryan, that group different families together in a single SSTable (or
> > HStore in our case).  Changes in 0.20 leave the door open for HBase to
> > also implement them, but it is not currently on the roadmap.
> >
> > Hope that helps.
> >
> > JG
> >
> > > -----Original Message-----
> > > From: Puri, Aseem [mailto:Aseem.Puri@Honeywell.com]
> > > Sent: Monday, March 09, 2009 3:22 AM
> > > To: hbase-user@hadoop.apache.org
> > > Subject: RE: Many columns in 0.19
> > >
> > >
> > > Hi
> > >
> > > I was reading the Google Bigtable article. Many things in HBase are
> > > similar to Bigtable, but I can't understand the concept of HStore.
> > > Does HStore mean one column family in one map file?
> > >
> > > Does Bigtable also store one column family in one SSTable?
> > >
> > > -Aseem
> > >
> > > -----Original Message-----
> > > From: Ryan Rawson [mailto:ryanobjc@gmail.com]
> > > Sent: Monday, March 09, 2009 3:20 PM
> > > To: hbase-user@hadoop.apache.org
> > > Subject: Re: Many columns in 0.19
> > >
> > > Don't forget, each column family is another file on disk, and another
> > > open file. Every column family is stored in its own MapFile, and that
> > > increases the load on HDFS.
> > >
> > > This particular restriction won't ever really go away (unless we
> > > introduce locality groups, and even then, each locality group = N
> > > families = 1 file), but in 0.20 it should be more feasible to have
> > > thousands of columns per family, or more.
> > >
> > > -ryan
> > >
> > > On Mon, Mar 9, 2009 at 1:47 AM, Michael Dagaev <michael.dagaev@gmail.com> wrote:
> > >
> > > > Thank you, Ryan
> > > >
> > > > On Mon, Mar 9, 2009 at 10:28 AM, Ryan Rawson <ryanobjc@gmail.com> wrote:
> > > > > Sadly this is still a limit.
> > > > >
> > > > > 0.20 should make things much better.
> > > > >
> > > > > -ryan
> > > > >
> > > > > On Mon, Mar 9, 2009 at 12:23 AM, Michael Dagaev <michael.dagaev@gmail.com> wrote:
> > > > >
> > > > > >> Hi, all
> > > > > >>
> > > > > >>    I remember it was not recommended to add many columns (column
> > > > > >> qualifiers) in HBase 0.18.
> > > > > >> Does HBase 0.19.0 still have this limitation?
> > > > >>
> > > > >> Thank you for your cooperation,
> > > > >> M.
> > > > >>
> > > > >
> > > >
>
>
>
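
For anyone following the quoted explanation, the layout it describes (one
HStore per column family per region, each with a single Memcache plus zero or
more HStoreFiles, flushes that create new files, and compactions that merge
them) can be modelled with a small toy in Python. This is only a sketch of
the idea, not HBase code: the class names, the flush threshold, and the use
of plain dicts are all assumptions made for illustration.

```python
class ToyHStore:
    """Toy model of one column family in one region (hypothetical, not HBase)."""

    def __init__(self, flush_threshold=3):
        self.memcache = {}        # in-memory writes, not yet flushed
        self.store_files = []     # flushed files, oldest first
        self.flush_threshold = flush_threshold  # assumed trigger, for the demo

    def put(self, row, value):
        # Writes always go to the Memcache first.
        self.memcache[row] = value
        if len(self.memcache) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Each flush creates exactly one new sorted "store file".
        if self.memcache:
            self.store_files.append(dict(sorted(self.memcache.items())))
            self.memcache = {}

    def compact(self):
        # Merge all store files into one; newer files win on duplicate rows.
        merged = {}
        for store_file in self.store_files:
            merged.update(store_file)
        self.store_files = [dict(sorted(merged.items()))] if merged else []

    def get(self, row):
        # Check the Memcache first, then store files from newest to oldest.
        if row in self.memcache:
            return self.memcache[row]
        for store_file in reversed(self.store_files):
            if row in store_file:
                return store_file[row]
        return None


class ToyHRegion:
    """Toy region: one ToyHStore per column family."""

    def __init__(self, families):
        self.stores = {family: ToyHStore() for family in families}


if __name__ == "__main__":
    region = ToyHRegion(["cf%d" % i for i in range(10)])
    print(len(region.stores))            # one store per column family
    store = region.stores["cf0"]
    for i in range(7):
        store.put("row%d" % i, i)
    print(len(store.store_files))        # files created by flushes
    store.compact()
    print(len(store.store_files))        # files left after compaction
```

The point of the toy is the read path: the more flushed files a store has,
the more places a get has to look, which is why compactions combine them.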
