hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Wide Rows vs Multiple column families
Date Mon, 29 Sep 2014 17:07:56 GMT
bq. had to spawn multiple put requests in this case because there is no API
for sending insert requests to multiple column families.

Could this be related to the slowdown you observed?

Are you able to use the HTable API and see whether the same slowdown is
reproduced?
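
To make the question above concrete: if each put request can target only one column family, as described in the quoted message, then a row split across two families needs two requests instead of one. With the HTable client, a single Put can hold cells for more than one family, so the same row can go out as one mutation. The sketch below is a toy model in plain Java that only counts requests under the two approaches. The class, method, and family names (`PutRequestModel`, `meta`, `blob`) are hypothetical and are not HBase or asynchbase APIs, and the cell sizes are rough figures taken from the thread.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these types are hypothetical, not HBase or asynchbase classes.
final class PutRequestModel {

    // One cell to write: row key, column family, qualifier, payload size in bytes.
    static final class Cell {
        final String row;
        final String family;
        final String qualifier;
        final int sizeBytes;

        Cell(String row, String family, String qualifier, int sizeBytes) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.sizeBytes = sizeBytes;
        }
    }

    // Groups cells by row (optionally also by family) and sums payload bytes per group.
    static Map<String, Integer> groupSizes(List<Cell> cells, boolean splitByFamily) {
        Map<String, Integer> groups = new LinkedHashMap<>();
        for (Cell c : cells) {
            String key = splitByFamily ? c.row + "/" + c.family : c.row;
            groups.merge(key, c.sizeBytes, Integer::sum);
        }
        return groups;
    }

    // One request per (row, family) pair, as when each put targets a single family.
    static int requestsPerFamily(List<Cell> cells) {
        return groupSizes(cells, true).size();
    }

    // One request per row, as when a single put carries cells for several families.
    static int requestsPerRow(List<Cell> cells) {
        return groupSizes(cells, false).size();
    }

    public static void main(String[] args) {
        // Two rows, each with about 1 KB of metadata and about 7 KB of blob data.
        List<Cell> cells = Arrays.asList(
            new Cell("row1", "meta", "m1", 1000),
            new Cell("row1", "blob", "b", 7000),
            new Cell("row2", "meta", "m1", 1000),
            new Cell("row2", "blob", "b", 7000));
        System.out.println("per-family requests: " + requestsPerFamily(cells));
        System.out.println("per-row requests: " + requestsPerRow(cells));
    }
}
```

For the two sample rows the model counts 4 per-family requests against 2 per-row requests. It does not measure anything about a real cluster; it only shows why the number of client requests is worth ruling out before attributing the slowdown to the column families themselves.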

BTW try 0.98.6.1 if you can.

Cheers

On Mon, Sep 29, 2014 at 10:00 AM, Nishanth S <nishanth.2884@gmail.com>
wrote:

> HBase release: 0.96.1
> The issue is observed with 2 column families. Earlier I had a single
> column family where all the data was persisted. In the new case I was
> storing all metadata in column family 1 (less than 1 KB) and a blob in
> the second column family (around 7 KB).
> We have a 9-node cluster with 7 HBase region servers, using Hadoop 2.3.0.
>
> I am also using the asynchbase client 1.5 for ingesting data into HBase. I
> had to spawn multiple put requests in this case because there is no API for
> sending insert requests to multiple column families.
>
> Thanks,
> Nishan
>
>
> On Mon, Sep 29, 2014 at 10:49 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > Can you give a bit more detail, such as:
> >
> > the release of HBase you're using
> > number of column families where slowdown is observed
> > size of cluster
> > release of hadoop you're using
> >
> > Thanks
> >
> > On Mon, Sep 29, 2014 at 9:43 AM, Nishanth S <nishanth.2884@gmail.com>
> > wrote:
> >
> > > Hey Ted,
> > >
> > > I was in the process of comparing the insert throughputs we discussed,
> > > using YCSB. What I found is that when I split the data into multiple
> > > column families, the insert throughput comes down to half compared to
> > > persisting into a single column family. Do you think this is possible,
> > > or am I doing something wrong?
> > >
> > > -Nishan
> > >
> > > On Thu, Sep 25, 2014 at 11:56 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >
> > > > There should not be any impact on HBase write performance with two
> > > > column families.
> > > >
> > > > Cheers
> > > >
> > > > On Thu, Sep 25, 2014 at 10:53 AM, Nishanth S <nishanth.2884@gmail.com>
> > > > wrote:
> > > >
> > > > > Thank you, Ted. No, I do not plan to use bulk loading, since the
> > > > > data is incremental in nature.
> > > > >
> > > > > On Thu, Sep 25, 2014 at 11:36 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > > >
> > > > > > For #1, do you plan to use bulk load?
> > > > > >
> > > > > > For #3, take a look at HBASE-5416, which introduced the essential
> > > > > > column family. In your query, you can designate the smaller column
> > > > > > family as the essential column family, where the smaller columns
> > > > > > are queried.
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > On Thu, Sep 25, 2014 at 9:57 AM, Nishanth S <nishanth.2884@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > This question may have been asked many times, but I would really
> > > > > > > appreciate it if someone could help me with how to go about this.
> > > > > > >
> > > > > > >
> > > > > > > Currently my HBase table consists of about 10 columns per row,
> > > > > > > which in total have an average size of 5K. Most of the size is
> > > > > > > held by one particular column (more than 4K). Would it help to
> > > > > > > move this column out to a different column family when we do
> > > > > > > reads? There are cases where we just need to access the smaller
> > > > > > > columns, and there is another set of use cases where we need
> > > > > > > both (the data in the smaller columns and this huge data chunk).
> > > > > > > In general I am trying to answer the questions below in this
> > > > > > > scenario.
> > > > > > >
> > > > > > >
> > > > > > > 1. Would separating into multiple column families affect HBase
> > > > > > > write performance?
> > > > > > >
> > > > > > > 2. How would it affect my read performance, considering both
> > > > > > > read cases?
> > > > > > >
> > > > > > > 3. Is there any advantage that I gain by separating into
> > > > > > > multiple CFs?
> > > > > > >
> > > > > > >
> > > > > > > I would really appreciate it if anyone could point me in the
> > > > > > > right direction.
> > > > > > >
> > > > > > >
> > > > > > > -Thanks
> > > > > > > Nishan
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
