hbase-user mailing list archives

From Yang Zhang <zhang.yang...@gmail.com>
Subject Re: Scan problem
Date Wed, 21 Mar 2018 11:09:31 GMT
Thanks to all of you; your answers helped me a lot.

2018-03-19 22:31 GMT+08:00 Saad Mufti <saad.mufti@gmail.com>:

> Another option, if you have enough disk space or off-heap memory, is to
> enable the bucket cache to cache even more of your data, and set the
> PREFETCH_ON_OPEN => true option on the column families you want always
> cached. That way HBase will prefetch your data into the bucket cache and
> your scan won't have that initial slowdown. Or if you want to do it
> globally for all column families, set the configuration flag
> "hbase.rs.prefetchblocksonopen" to "true". Keep in mind though that if you
> do this, you should have enough bucket cache space for all your
> data; otherwise there will be a lot of useless eviction activity at HBase
> startup and even later.
>
> Also, where a region is located is heavily affected by which region
> balancer you have chosen and how you have tuned it, in terms of how often
> it runs and other parameters. A split region will, initially at least,
> stay on the same region server, but your balancer, if and when it runs,
> can move it (and indeed any region) elsewhere to satisfy its criteria.
>
> Cheers.
>
> ----
> Saad
>
>
> On Mon, Mar 19, 2018 at 1:14 AM, ramkrishna vasudevan <
> ramkrishna.s.vasudevan@gmail.com> wrote:
>
> > Hi
> >
> > First regarding the scans,
> >
> > Generally the data resides in the store files, which are in HDFS. So
> > probably the first scan that you are doing is reading from HDFS, which
> > involves disk reads. Once the blocks are read, they are cached in the
> > block cache of HBase. So your further reads go through that, and hence
> > you see a speedup in the later scans.
> >
> > >> And another question about region split: I want to know which
> > >> RegionServer will load the new regions after the split.
> > >> Will it be the same one that hosted the old region?
> >
> > Yes. Generally the same region server hosts them.
> >
> > In master the code is here:
> > https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
> >
> > You may need to understand the entire flow to know how the regions are
> > opened after a split.
> >
> > Regards
> > Ram
> >
> > On Sat, Mar 17, 2018 at 9:02 PM, Yang Zhang <zhang.yang.dm@gmail.com>
> > wrote:
> >
> > > Hello everyone
> > >
> > > I am running many Scans using RegionScanner in a coprocessor, and
> > > every time, the first Scan takes about 10 times as long as the others.
> > > I don't know why this happens.
> > >
> > > OneBucket Scan cost is : 8794 ms Num is : 710
> > > OneBucket Scan cost is : 91 ms Num is : 776
> > > OneBucket Scan cost is : 87 ms Num is : 808
> > > OneBucket Scan cost is : 105 ms Num is : 748
> > > OneBucket Scan cost is : 68 ms Num is : 200
> > >
> > >
> > > And another question about region split: I want to know which
> > > RegionServer will load the new regions after the split.
> > > Will it be the same one that hosted the old region? Does anyone know
> > > where I can find the code to learn about that?
> > >
> > >
> > > Thanks for your help
> > >
> >
>
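
The two cache effects described in this thread (a slow first scan that speeds up once blocks are cached, and useless eviction when the cache is smaller than the data) can be illustrated with a toy model. This is only a sketch under loud assumptions: ToyBlockCache is a hypothetical class, not HBase's block cache or bucket cache, it uses a plain LRU policy built on java.util.LinkedHashMap, and it counts misses instead of measuring real disk reads.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy LRU model of a block cache. Hypothetical; NOT HBase's implementation. */
final class ToyBlockCache {
    private final int capacity;
    private final LinkedHashMap<Integer, byte[]> blocks;
    private long misses = 0;

    ToyBlockCache(int capacity) {
        this.capacity = capacity;
        // accessOrder = true keeps entries ordered least-recently-used first.
        this.blocks = new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                // Evict the least-recently-used block once the cache is over capacity.
                return size() > ToyBlockCache.this.capacity;
            }
        };
    }

    /** Reads one block; a miss stands in for a disk read from the store file. */
    byte[] read(int blockId) {
        byte[] block = blocks.get(blockId);
        if (block == null) {
            misses++;
            block = new byte[0]; // placeholder for data fetched from disk
            blocks.put(blockId, block);
        }
        return block;
    }

    /** Scans blocks 0..numBlocks-1 in order; returns the misses this scan caused. */
    long scan(int numBlocks) {
        long before = misses;
        for (int i = 0; i < numBlocks; i++) {
            read(i);
        }
        return misses - before;
    }

    public static void main(String[] args) {
        // Cache large enough for all data: only the first scan goes to "disk".
        ToyBlockCache big = new ToyBlockCache(100);
        System.out.println("big, scan 1 misses: " + big.scan(100));   // 100
        System.out.println("big, scan 2 misses: " + big.scan(100));   // 0

        // Cache half the size of the data: a sequential scan evicts every
        // block before it is needed again, so repeated scans never get faster.
        ToyBlockCache small = new ToyBlockCache(50);
        System.out.println("small, scan 1 misses: " + small.scan(100)); // 100
        System.out.println("small, scan 2 misses: " + small.scan(100)); // 100
    }
}
```

In this model, prefetching would correspond to paying the first scan's misses at open time rather than on the first request. The second case suggests why that only helps when the cache can hold the whole working set.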
