hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Wide Rows vs Multiple column families
Date Mon, 29 Sep 2014 16:49:41 GMT
Can you give a bit more detail, such as:

the release of HBase you're using
number of column families where slowdown is observed
size of cluster
release of Hadoop you're using

Thanks

On Mon, Sep 29, 2014 at 9:43 AM, Nishanth S <nishanth.2884@gmail.com> wrote:

> Hey Ted,
>
> I was in the process of comparing the insert throughputs we discussed,
> using YCSB. What I found is that when I split the data into multiple
> column families, the insert throughput drops to half of what I get when
> persisting into a single column family. Do you think this is possible, or
> am I doing something wrong?
>
> -Nishan
>
> On Thu, Sep 25, 2014 at 11:56 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > There should not be an impact on HBase write performance for two column
> > families.
> >
> > Cheers
> >
> > On Thu, Sep 25, 2014 at 10:53 AM, Nishanth S <nishanth.2884@gmail.com>
> > wrote:
> >
> > > Thank you, Ted. No, I do not plan to use bulk loading, since the data is
> > > incremental in nature.
> > >
> > > On Thu, Sep 25, 2014 at 11:36 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >
> > > > For #1, do you plan to use bulk load?
> > > >
> > > > For #3, take a look at HBASE-5416, which introduced the essential column
> > > > family. In your query, you can designate the smaller column family as the
> > > > essential column family where the smaller columns are queried.
> > > >
> > > > Cheers
> > > >
> > > > On Thu, Sep 25, 2014 at 9:57 AM, Nishanth S <nishanth.2884@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > This question may have been asked many times, but I would really
> > > > > appreciate it if someone could help me with how to go about this.
> > > > >
> > > > >
> > > > > Currently my HBase table consists of about 10 columns per row, which
> > > > > in total have an average size of 5K. The bulk of the size is held by
> > > > > one particular column (more than 4K). Would it help reads to move this
> > > > > column out to a different column family? There are cases where we just
> > > > > need to access the smaller columns, and there is another set of use
> > > > > cases where we need both (the data in the smaller columns and this huge
> > > > > data chunk). In general, I am trying to answer the questions below for
> > > > > this scenario.
> > > > >
> > > > >
> > > > > 1. Would separating into multiple column families affect HBase write
> > > > > performance?
> > > > >
> > > > > 2. How would it affect my read performance, considering both read
> > > > > cases?
> > > > >
> > > > > 3. Is there any advantage that I gain by separating into multiple
> > > > > CFs?
> > > > >
> > > > >
> > > > > I would really appreciate it if anyone could point me in the right
> > > > > direction.
> > > > >
> > > > >
> > > > > -Thanks
> > > > > Nishan
> > > > >
> > > >
> > >
> >
>
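
The read-side question in the original message (about 5K per row, with one column holding more than 4K) can be put into rough numbers. The sketch below is only a back-of-envelope model and not a measurement of HBase: the function name and the byte constants are illustrative assumptions derived from the figures quoted in the thread, and it ignores per-cell key overhead, block granularity, compression, and caching.

```python
# Back-of-envelope model of the row layout described in the thread.
# The sizes are assumptions taken from the original question: rows average
# about 5K, and one column accounts for a bit more than 4K of that.

ROW_BYTES = 5 * 1024            # average row size quoted in the thread
LARGE_COLUMN_BYTES = 4 * 1024   # lower bound for the one large column
SMALL_COLUMNS_BYTES = ROW_BYTES - LARGE_COLUMN_BYTES  # the other columns


def bytes_read_per_row(needs_large_column: bool, split_families: bool) -> int:
    """Estimate how many value bytes are touched to serve one row.

    Simplifying assumption: with a single column family, small and large
    cells sit in the same store files, so the whole row is touched even
    when only the small columns are requested.
    """
    if not split_families:
        return ROW_BYTES
    if needs_large_column:
        # Both families have to be read to assemble the full row.
        return ROW_BYTES
    # Only the family holding the small columns is read.
    return SMALL_COLUMNS_BYTES


if __name__ == "__main__":
    for needs_large in (False, True):
        for split in (False, True):
            label = (
                f"needs_large_column={needs_large!s:5} "
                f"split_families={split!s:5}"
            )
            print(label, bytes_read_per_row(needs_large, split), "bytes")
```

Under these assumptions, a second column family only reduces the data touched for reads that need just the small columns; reads that need the large column touch the whole row either way. This says nothing about the write throughput difference reported above, which would need the cluster details requested at the top of the thread.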
