trafodion-codereview mailing list archives

From DaveBirdsall <...@git.apache.org>
Subject [GitHub] incubator-trafodion pull request #1246: [TRAFODION-2645] First draft of a re...
Date Wed, 27 Sep 2017 21:59:26 GMT
GitHub user DaveBirdsall opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/1246

    [TRAFODION-2645] First draft of a rewrite of the MDAM costing code

    This set of changes is a first draft of a rewrite of the MDAM costing code.
    
    The rewritten code uses a model much more closely aligned with how the MDAM run-time works.
    It estimates the number of MDAM probes and fetches directly. I/O cost is estimated differently.
    I/O cost is not additive across disjuncts, because the more parts of a file are touched,
    the more the access pattern resembles sequential I/O. On the other hand, the cost of an HBase
    scan (that is, a begin-key/end-key subset in executor terms) is significant, and its
    contribution to cost is additive. A knob, MDAM_SUBSET_FACTOR, has been added to tune that cost.
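
The additive versus non-additive split described above can be illustrated with a small
Python sketch. This is not the Trafodion optimizer code. Every function name, formula
shape and constant below is a hypothetical stand-in chosen for illustration, and treating
each probe and each fetch as one subset is an assumption. MDAM_SUBSET_FACTOR is the only
name taken from the description, and the value given to it here is a placeholder.

```python
from dataclasses import dataclass
from typing import Sequence

# Placeholder value; the real default of the knob is not stated in the description.
MDAM_SUBSET_FACTOR = 1.0


@dataclass
class DisjunctEstimate:
    """Hypothetical per-disjunct estimates produced by a cost model."""
    probes: float   # estimated MDAM probes
    fetches: float  # estimated MDAM fetches
    rows: float     # estimated rows returned
    blocks: float   # estimated blocks touched


def additive_cost(d: DisjunctEstimate,
                  subset_factor: float = MDAM_SUBSET_FACTOR,
                  row_cost: float = 0.01) -> float:
    """Cost terms that simply add up across disjuncts.

    Assumption: every probe and every fetch opens one begin-key/end-key subset.
    """
    subsets = d.probes + d.fetches
    return subset_factor * subsets + row_cost * d.rows


def io_cost(disjuncts: Sequence[DisjunctEstimate],
            table_blocks: float,
            random_cost: float = 2.0,
            sequential_cost: float = 1.0) -> float:
    """I/O cost computed once for the whole scan, not per disjunct.

    The per-block cost slides from random_cost toward sequential_cost as the
    fraction of the file that is touched grows (an invented interpolation).
    """
    touched = min(sum(d.blocks for d in disjuncts), table_blocks)
    fraction = touched / table_blocks
    per_block = random_cost + (sequential_cost - random_cost) * fraction
    return touched * per_block


def mdam_scan_cost(disjuncts: Sequence[DisjunctEstimate],
                   table_blocks: float) -> float:
    """Whole-scan cost: additive terms per disjunct plus one shared I/O term."""
    return (sum(additive_cost(d) for d in disjuncts)
            + io_cost(disjuncts, table_blocks))
```

In this sketch, costing two disjuncts together yields a smaller I/O term than costing each
alone and summing, while the subset term grows linearly with the number of probes and
fetches. Raising the subset factor therefore penalizes plans that open many subsets.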
    
    The cost formulas used to determine the optimal disjunct prefix are as close as possible to
    the cost formula used to cost the MDAM scan as a whole. The only thing left out in the former
    is the I/O cost, as that is not additive. In contrast, in the old code, the costing formulas
    used for the optimal disjunct prefix are quite different from the one used for the scan as a
    whole, and it is hard to see their relationship.
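
One way to picture the prefix choice is the Python sketch below: each candidate prefix
length of a disjunct is costed with the non-I/O terms only, and the cheapest one wins.
This is an illustration under stated assumptions, not the optimizer's algorithm. The
function names, the cost formula, the row_cost constant and the sample estimates are all
hypothetical.

```python
from typing import Sequence, Tuple

# (probes, fetches, rows) estimated for one candidate prefix length; hypothetical shape.
PrefixEstimate = Tuple[float, float, float]


def non_io_cost(probes: float, fetches: float, rows: float,
                subset_factor: float = 1.0, row_cost: float = 0.01) -> float:
    """Additive part of the scan cost; the I/O term is deliberately left out."""
    return subset_factor * (probes + fetches) + row_cost * rows


def best_prefix_length(estimates: Sequence[PrefixEstimate],
                       subset_factor: float = 1.0) -> int:
    """Return the 1-based prefix length with the lowest non-I/O cost.

    estimates[i] holds the estimates for a prefix of i + 1 key columns.
    """
    costs = [non_io_cost(p, f, r, subset_factor) for p, f, r in estimates]
    return min(range(len(costs)), key=costs.__getitem__) + 1


# Invented numbers: deeper prefixes need more probes but fetch fewer rows.
sample = [(10, 10, 100000), (110, 100, 5000), (5000, 4000, 4000)]
```

With these invented numbers, a subset factor of 1.0 favors a two-column prefix, while a
subset factor of 10.0 makes the extra probes too expensive and the one-column prefix wins.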
    
    I have run a performance test on the test bed in JIRA TRAFODION-1641, using old and new
    costing code, and forcing both serial and parallel MDAM plans of various depths, and also
    simple scan plans. The new code's aggregate execution time over that test bed is about 6%
    better than the old code's. So the new code seems to be at least as good as the old. The new
    code picks the optimal plan more frequently than the old. There are about eight queries
    (out of 92) where the old code picks a better plan than the new code.
    
    There is still some testing work to be done on this code. Costing of the inner table of
    a nested join has not been fully explored yet.
    
    In this check-in, the new costing code is turned off by default. Use CQD MDAM_COSTING_REWRITE
    'ON' to turn on the new costing code.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/DaveBirdsall/incubator-trafodion MDAMCostingRewrite

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/1246.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1246
    
----
commit fe2a6f616177ff16475ba9d64b1c88c1014138eb
Author: Dave Birdsall <dbirdsall@apache.org>
Date:   2017-09-27T21:49:12Z

    [TRAFODION-2645] First draft of a rewrite of the MDAM costing code

----

