Hi Eric,
You can set the CQD Hbase_index_level to a positive integer value (such as
2) to bypass the JNI call completely. Note that the JNI call also reads the
index level, which is a function of the HFiles in use.
Since we cache NATable, which is an in-memory representation of an HBase
table, the JNI cost is only incurred once for all references to a table,
per SQL session. If the table is redefined, we will re-read the definition
and call the JNI again.
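To illustrate the caching behavior described above, here is a minimal sketch in Python. It is not the actual NATable code: `TableInfo`, `TableInfoCache`, the `fetch` callback (standing in for the expensive JNI call) and the `redef_time` stamp used to detect a redefined table are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class TableInfo:
    block_size: int   # HBase block size in bytes
    index_level: int  # HFile index level


class TableInfoCache:
    """Per-session cache: one expensive fetch per table until it is redefined."""

    def __init__(self, fetch: Callable[[str], TableInfo]) -> None:
        self._fetch = fetch  # hypothetical stand-in for the JNI call
        self._entries: Dict[str, Tuple[int, TableInfo]] = {}
        self.fetch_count = 0  # counts how often the expensive call ran

    def get(self, table: str, redef_time: int) -> TableInfo:
        entry = self._entries.get(table)
        # Fetch on first reference, or when the table was redefined
        # since the cached entry was created.
        if entry is None or entry[0] != redef_time:
            self.fetch_count += 1
            entry = (redef_time, self._fetch(table))
            self._entries[table] = entry
        return entry[1]


def fake_jni_fetch(table: str) -> TableInfo:
    # Assumed values for illustration only (64KB block, index level 2).
    return TableInfo(block_size=64 * 1024, index_level=2)


cache = TableInfoCache(fake_jni_fetch)
cache.get("T1", redef_time=1)  # first reference: expensive fetch
cache.get("T1", redef_time=1)  # served from the cache
cache.get("T1", redef_time=2)  # table redefined: fetched again
print(cache.fetch_count)       # prints 2
```

In this sketch, repeated scans of the same table pay for the fetch only once per session, which is why the block size can be read from the cached entry without a per-query cost.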
Thanks --Qifan
On Wed, Aug 5, 2015 at 5:47 PM, Eric Owhadi <eric.owhadi@esgyn.com> wrote:
> For the small scanner feature, I first hardcoded the default HBase block
> size (64KB) in the check to see whether the small scanner is a good
> candidate to enable.
>
> Then, struggling with the regression tests and the modifications needed to
> pass them, I realized that it is better to have a feature totally completed
> to avoid the cost of dealing with regressions every time you want to improve
> it.
>
> So I am trying to remove the hardcoded HBase block size, and realized
> there is a function getHbaseTableInfo that will do the job of getting the
> HBase block size and index level. BUT, this function makes a JNI call to the
> HBase layer. That looks expensive to call for every scan query, and I would
> likely lose the gain of the small scanner if I put this call in the middle.
>
> I was wondering whether this per-table HBase block size is already cached in
> memory, so that I can access it without fear of a performance cost? I guess
> it is table metadata, and I think this is supposed to be cached already, if
> I recall a previous email thread correctly?
>
> Thanks in advance for the help,
>
> Eric
>
--
Regards, --Qifan