trafodion-dev mailing list archives

From Dave Birdsall <dave.birds...@esgyn.com>
Subject RE: investigating MDAM
Date Thu, 25 Feb 2016 22:58:04 GMT

Hi Eric,

MDAM uses "probes" to materialize the next value of a key column. It only
needs to read one row to do that.

MDAM uses "fetches" to read key ranges of data that satisfy the MDAM
disjuncts. Those key ranges may be larger, so a larger cache size might be
warranted there.
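
To make the probe/fetch distinction concrete, here is a minimal Python model of the idea. It is only a sketch, not the Trafodion executor code: the function name mdam_scan and the in-memory sorted list of (k1, k2) tuples are assumptions made for illustration. It handles a single predicate "k2 between lo and hi" on a table keyed on (k1, k2).

```python
from bisect import bisect_left, bisect_right

def mdam_scan(keys, lo, hi):
    """Toy MDAM model over a sorted list of (k1, k2) tuples.

    Returns (matching_rows, probe_count, fetch_count).
    """
    rows, probes, fetches = [], 0, 0
    pos = 0
    while pos < len(keys):
        # Probe: read a single row to learn the next distinct k1 value.
        k1 = keys[pos][0]
        probes += 1
        # Fetch: read the key range (k1, lo) .. (k1, hi).
        start = bisect_left(keys, (k1, lo))
        end = bisect_right(keys, (k1, hi))
        rows.extend(keys[start:end])
        fetches += 1
        # Skip past every remaining row with this k1 before probing again.
        pos = bisect_right(keys, (k1, float("inf")))
    return rows, probes, fetches

# Example: 3 distinct k1 values, 5 k2 values each.
keys = [(a, b) for a in range(3) for b in range(5)]
rows, probes, fetches = mdam_scan(keys, 1, 2)
```

In this model each distinct k1 value costs one probe (one row read) and one fetch (a range read), which is why the two kinds of access want different buffer sizes.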

We had some issues in the past where probes were using large cache sizes.
That is, we'd read, say, 10000 rows from HBase into a buffer when one row
would do.
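
As a rough illustration of the sizing rule this implies, the sketch below picks a scanner cache size from the kind of key range. The names (PROBE, FETCH, scanner_cache_size) and the max_cache default are hypothetical and do not come from the Trafodion source; the 10000 figure is just the example number mentioned above.

```python
PROBE = "probe"
FETCH = "fetch"

def scanner_cache_size(range_kind, estimated_rows, max_cache=10000):
    """Return how many rows to buffer for one key range (illustrative only)."""
    if range_kind == PROBE:
        # A probe only needs the first row at or after the probe key.
        return 1
    # A fetch may return many rows; cap the buffer at max_cache.
    return max(1, min(estimated_rows, max_cache))

probe_size = scanner_cache_size(PROBE, 10000)
small_fetch_size = scanner_cache_size(FETCH, 12)
large_fetch_size = scanner_cache_size(FETCH, 50000)
```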

The run-time code for MDAM is in several places.

Check out method keyMdamEx::getNextKeyRange in executor/ex_mdam.cpp. That is
the method that returns the next key range to a scan tcb. The return code
tells the scan tcb if the key range being returned is for a "probe" or a
"fetch". Many tcbs potentially could use MDAM; by putting a breakpoint at
this method and looking up the stack you can find the code in the tcb that
you're interested in.

Dave

-----Original Message-----
From: Eric Owhadi [mailto:eric.owhadi@esgyn.com]
Sent: Thursday, February 25, 2016 2:40 PM
To: dev@trafodion.incubator.apache.org
Subject: investigating MDAM

Hi Trafodioneers,

Can someone explain, or point me to the source file that handles, MDAM
probing?

I think I found an anomaly in how cache size and the small scanner are used
with MDAM, but I would like to make sure, so I would like to understand how
probing works in MDAM.

Thanks in advance for the help,

Eric



FYI, my test shows 4010 IOs to go over 1000 MDAM ranges, so about 4 IOs per
range. I am assuming these are:

- Open scanner?
- Fetch?
- Close scanner?
- Then another one for probing? I would assume it is a get, is it?
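
The arithmetic behind the "4 IO per range" estimate can be checked with a couple of lines of Python. The variable names are illustrative; the two input numbers are the ones reported above, and what the 10 leftover IOs correspond to is not known from the test.

```python
total_io = 4010     # IOs observed in the test
mdam_ranges = 1000  # MDAM ranges traversed

# Whole IOs per range, plus whatever is not explained by that multiple.
io_per_range, leftover_io = divmod(total_io, mdam_ranges)
```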



So the problem I am seeing is that neither the cache size logic nor the
small scanner logic takes into account that MDAM splits the scan into
smaller chunks. So I am trying to find out whether the compiler knows in
advance how many chunks MDAM will end up using. Is there a tdb value that
carries this info?

The consequence is that we are using a larger cache size than needed, and
we are not benefiting from the small scanner's faster speed and lower IO
with MDAM.





See below:



create table t132helper (a int not null, primary key(a));

insert into t132helper values(1);

create table t132 (k1 int not null, k2 int not null, a int not null,
                   b int not null, c char(1000),
                   primary key (k1,k2)) ATTRIBUTES ALIGNED FORMAT;

upsert using load into t132
  select x1000*1000 + x10000*10000 + x100000*100000,
         x1 + x10*10 + x100*100,
         x1 + x10*10 + x100*100 + x1000*1000 + x10000*10000 + x100000*100000,
         x1 + x10*10 + x100*100 + x1000*1000 + x10000*10000 + x100000*100000,
         'yo bro'
  from t132helper
  transpose 0,1,2,3,4,5,6,7,8,9 as x1
  transpose 0,1,2,3,4,5,6,7,8,9 as x10
  transpose 0,1,2,3,4,5,6,7,8,9 as x100
  transpose 0,1,2,3,4,5,6,7,8,9 as x1000
  transpose 0,1,2,3,4,5,6,7,8,9 as x10000
  transpose 0,1,2,3,4,5,6,7,8,9 as x100000;

update statistics for table t132 on every column;

select count(*) from t132 where k2 between 10 and 21;
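
For reference, the shape of this test data can be modeled in a few lines of Python. This is a sketch under two assumptions: that the six transposes produce the full cross product of digit values, and that MDAM uses one key range per distinct k1 value. All variable names are illustrative.

```python
from itertools import product

digits = range(10)

# k1 is built from the thousands, ten-thousands and hundred-thousands digits.
k1_values = sorted({x1000 * 1000 + x10000 * 10000 + x100000 * 100000
                    for x1000, x10000, x100000 in product(digits, repeat=3)})

# k2 is built from the units, tens and hundreds digits.
k2_values = sorted({x1 + x10 * 10 + x100 * 100
                    for x1, x10, x100 in product(digits, repeat=3)})

# Every (k1, k2) combination occurs exactly once in t132.
total_rows = len(k1_values) * len(k2_values)

# Rows selected by "k2 between 10 and 21" within a single k1 value.
rows_per_range = sum(1 for k2 in k2_values if 10 <= k2 <= 21)

# One MDAM key range per distinct k1 value (assumption stated above).
mdam_ranges = len(k1_values)
expected_count = mdam_ranges * rows_per_range
```

Under those assumptions the table holds one million rows, the query touches 1000 key ranges, and each range contains only 12 qualifying rows, so a large scanner cache buys nothing for either the probes or the fetches.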
