hbase-user mailing list archives

From Anil <anilk...@gmail.com>
Subject Re: Scan a region in parallel
Date Fri, 21 Oct 2016 07:56:22 GMT
Thank you Ram.

"So now  you are spawning those many scan threads equal to the number of
regions " - YES

There are two ways of scanning a region in parallel:

1. Scan a region with a start row and stop row using a single scan
operation, with HBase taking care of the parallelism internally on the
server side.
2. Transform the start row and stop row of a region into a number of start
and stop row pairs (by some criteria) and spawn a scan query for each pair.

#1 is not supported (as you also said).

I am looking for #2. I checked the Phoenix documentation and code, and it
seems to me that Phoenix is doing #2, but I could not understand the code
completely.
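
A minimal sketch of #2, assuming the region's start and stop rows are both
known and non-empty (the first and last regions of a table have an empty
boundary, so a real minimum/maximum key would have to be substituted there).
The class and method names are hypothetical, and the uniform byte-range split
is only one possible "criteria". It is not what Phoenix does: as Ram notes
below, Phoenix relies on collected guide posts. HBase's own `Bytes` utility
class also offers a `split` method for a similar purpose.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits a region's [startRow, stopRow) into sub-ranges. */
class RegionRangeSplitter {

    /** Converts a non-negative value (< 256^len) back to exactly len bytes. */
    static byte[] toBytes(BigInteger v, int len) {
        byte[] raw = v.toByteArray(); // may carry a leading sign byte or be shorter than len
        byte[] out = new byte[len];
        int copy = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - copy, out, len - copy, copy);
        return out;
    }

    /**
     * Returns boundary keys; each consecutive pair is one [start, stop) sub-range.
     * At most n sub-ranges are produced (fewer if the key range is very narrow).
     */
    static List<byte[]> boundaries(byte[] startRow, byte[] stopRow, int n) {
        int len = Math.max(startRow.length, stopRow.length);
        // Right-pad with zero bytes so both keys are compared at the same length.
        BigInteger lo = new BigInteger(1, Arrays.copyOf(startRow, len));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(stopRow, len));
        if (n < 1 || lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("need n >= 1 and startRow < stopRow");
        }
        BigInteger span = hi.subtract(lo);
        List<byte[]> keys = new ArrayList<>();
        keys.add(startRow);
        BigInteger prev = lo;
        for (int i = 1; i < n; i++) {
            BigInteger mid = lo.add(span.multiply(BigInteger.valueOf(i))
                                        .divide(BigInteger.valueOf(n)));
            if (mid.compareTo(prev) > 0) { // skip duplicates so no sub-range is empty
                keys.add(toBytes(mid, len));
                prev = mid;
            }
        }
        keys.add(stopRow);
        return keys;
    }

    public static void main(String[] args) {
        // Example: split the key range [0x00, 0x10) into 4 sub-ranges.
        for (byte[] key : boundaries(new byte[] {0x00}, new byte[] {0x10}, 4)) {
            System.out.println(Arrays.toString(key));
        }
    }
}
```

Each consecutive pair of boundaries would then become the start row
(inclusive) and stop row (exclusive) of one scan. A uniform split only
balances the work if the row keys are roughly evenly distributed within the
region.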

The use case is very simple. HBase is not good (at least in terms of
performance for OLTP) at querying by any column (other than the row key) or
at sorting by any column of a row. Phoenix has the same limitation.

So I am planning to load the HBase/Phoenix table into an in-memory database
for faster access.

Scanning a big region sequentially leads to a long load time, so I am
looking for ways to minimize it.
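
The load step could be sketched as follows. This is a stand-in using only the
JDK, not HBase client code: a `NavigableMap` plays the role of one region, a
`ConcurrentHashMap` plays the role of the in-memory database, and
`ParallelRangeLoader` is a hypothetical name. In a real client, each task
would instead run its own scan bounded by the sub-range, using its own
`Table` instance, since `Table` is documented as not thread-safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical loader: one task per [start, stop) sub-range of a region. */
class ParallelRangeLoader {

    static Map<String, String> load(NavigableMap<String, String> region,
                                    List<String> boundaries,
                                    int threads) {
        Map<String, String> inMemoryDb = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i + 1 < boundaries.size(); i++) {
                final String start = boundaries.get(i);
                final String stop = boundaries.get(i + 1);
                pending.add(pool.submit(() -> {
                    // Stand-in for a scan: start row inclusive, stop row exclusive.
                    inMemoryDb.putAll(region.subMap(start, true, stop, false));
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every sub-range and surface any failure
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("sub-range load failed", e);
        } finally {
            pool.shutdown();
        }
        return inMemoryDb;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> region = new TreeMap<>();
        for (char c = 'a'; c <= 'h'; c++) {
            region.put("row-" + c, "value-" + c);
        }
        Map<String, String> db = load(region, Arrays.asList("row-a", "row-d", "row-i"), 2);
        System.out.println(db.size() + " rows loaded");
    }
}
```

Because the sub-ranges are disjoint, the tasks never write the same key, so
the only shared state is the concurrent target map. The useful thread count
is bounded by what a single region server can serve, since all sub-ranges of
one region live on the same server.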

Hope this helps.


On 21 October 2016 at 09:30, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> Hi Anil
> So now  you are spawning those many scan threads equal to the number of
> regions.
> bq.Is there any way to scan a region in parallel ?
> You mean you want to scan in parallel within a region? That is, you want
> to split a single query into N small scans and read and aggregate the
> results on the client side/server side?
> Currently you cannot do that. Once you set a start and stop row, the scan
> will determine which region it belongs to and retrieve the data
> sequentially in that region (applying any filters you set during the
> course of the scan).
> Have you tried Apache Phoenix? It's a SQL wrapper over HBase, and there
> you can do parallel scans for a given SQL query if some guide posts have
> been collected. Such things cannot be an integral part of HBase. But I
> fear that, as I am not aware of your use case, we cannot suggest much on
> this.
> Regards
> Ram
> On Fri, Oct 21, 2016 at 8:40 AM, Anil <anilklce@gmail.com> wrote:
> > Any pointers ?
> >
> > On 20 October 2016 at 18:15, Anil <anilklce@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I am loading hbase table into an in-memory db to support filter,
> > > ordering and pagination.
> > >
> > > I am scanning each region and inserting the data into the in-memory
> > > db. Each region scan runs in its own thread, so the regions are
> > > scanned in parallel.
> > >
> > > Is there any way to scan a region in parallel ? any pointers would be
> > > helpful.
> > >
> > > Thanks
> > >
> >
