calcite-dev mailing list archives

From "Volodymyr Vysotskyi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CALCITE-2166) Cumulative cost of RelSubset.best RelNode is increased after calling RelSubset.propagateCostImprovements() for input RelNodes
Date Mon, 05 Feb 2018 17:01:00 GMT
Volodymyr Vysotskyi created CALCITE-2166:
--------------------------------------------

             Summary: Cumulative cost of RelSubset.best RelNode is increased after calling
RelSubset.propagateCostImprovements() for input RelNodes
                 Key: CALCITE-2166
                 URL: https://issues.apache.org/jira/browse/CALCITE-2166
             Project: Calcite
          Issue Type: Bug
          Components: core
    Affects Versions: 1.15.0
            Reporter: Volodymyr Vysotskyi
            Assignee: Julian Hyde


After {{RelSubset.propagateCostImprovements()}} is called, the cumulative cost of the {{RelSubset.best}}
{{RelNode}} may increase: when the best {{RelNode}} of an input changes, the non-cumulative cost of
the parent can grow.
To observe this issue, add the following check:
{code:java}
// Recompute the cost of the cached best RelNode and compare it with the cached bestCost.
if (subset.best != null) {
  RelOptCost bestCost = getCost(subset.best, RelMetadataQuery.instance());
  if (!subset.bestCost.equals(bestCost)) {
    throw new AssertionError(
        "relSubset [" + subset.getDescription()
            + "] has wrong best cost "
            + subset.bestCost + ". Correct cost is " + bestCost);
  }
}
{code}
into the {{VolcanoPlanner.validate()}} method (line 907).
The following unit tests fail with this check:
{noformat}
Failed tests: 
  MaterializationTest.testJoinMaterializationUKFK9:1823->checkMaterialize:198->checkMaterialize:205->checkThatMaterialize:233 relSubset [rel#226287:Subset#8.ENUMERABLE.[]] has wrong best cost {221.5 rows, 128.25 cpu, 0.0 io}. Correct cost is {233.0 rows, 178.0 cpu, 0.0 io}
  ScannableTableTest.testPFPushDownProjectFilterAggregateNested:279 relSubset [rel#12950:Subset#5.ENUMERABLE.[]] has wrong best cost {63.8 rows, 62.308 cpu, 0.0 io}. Correct cost is {70.4 rows, 60.404 cpu, 0.0 io}
  ScannableTableTest.testPFTableRefusesFilterCooperative:221 relSubset [rel#13382:Subset#2.ENUMERABLE.[]] has wrong best cost {81.0 rows, 181.01 cpu, 0.0 io}. Correct cost is {150.5 rows, 250.505 cpu, 0.0 io}
  ScannableTableTest.testProjectableFilterableCooperative:148 relSubset [rel#13611:Subset#2.ENUMERABLE.[]] has wrong best cost {81.0 rows, 181.01 cpu, 0.0 io}. Correct cost is {150.5 rows, 250.505 cpu, 0.0 io}
  ScannableTableTest.testProjectableFilterableNonCooperative:165 relSubset [rel#13754:Subset#2.ENUMERABLE.[]] has wrong best cost {81.0 rows, 181.01 cpu, 0.0 io}. Correct cost is {150.5 rows, 250.505 cpu, 0.0 io}
  FrameworksTest.testUpdate:336->executeQuery:367 relSubset [rel#22533:Subset#2.ENUMERABLE.any] has wrong best cost {19.5 rows, 37.75 cpu, 0.0 io}. Correct cost is {22.575 rows, 52.58 cpu, 0.0 io}
{noformat}
For the test {{MaterializationTest.testJoinMaterializationUKFK9}}, the initial best plan was:
{noformat}
EnumerableProject(empid0=[$5], empid00=[$5], deptno0=[$7]): rowcount = 15.0, cumulative cost = {15.0 rows, 45.0 cpu, 0.0 io}, id = 3989
  EnumerableJoin(subset=[rel#3988:Subset#34.ENUMERABLE.[]], condition=[=($1, $7)], joinType=[inner]): rowcount = 15.0, cumulative cost = {116.0 rows, 0.0 cpu, 0.0 io}, id = 4797
    EnumerableFilter(subset=[rel#4274:Subset#47.ENUMERABLE.[0]], condition=[=(CAST($2):VARCHAR CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary", 'Bill')]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 16522
      EnumerableTableScan(subset=[rel#158:Subset#11.ENUMERABLE.[0]], table=[[hr, m0]]): rowcount = 1.0, cumulative cost = {0.0 rows, 1.0 cpu, 0.0 io}, id = 79
    EnumerableTableScan(subset=[rel#115:Subset#5.ENUMERABLE.[]], table=[[hr, depts]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 62
{noformat}
Its cumulative cost is {{{221.5 rows, 123.75 cpu, 0.0 io}}}.

After some rules were applied, the best plan became:
{noformat}
EnumerableProject(empid0=[$3], empid00=[$3], deptno0=[$0]): rowcount = 2.25, cumulative cost = {2.25 rows, 6.75 cpu, 0.0 io}, id = 4012
  EnumerableFilter(subset=[rel#4007:Subset#41.ENUMERABLE.[]], condition=[=(CAST($2):VARCHAR CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary", 'Bill')]): rowcount = 2.25, cumulative cost = {2.25 rows, 15.0 cpu, 0.0 io}, id = 4811
    EnumerableProject(subset=[rel#4203:Subset#61.ENUMERABLE.[]], deptno=[$7], deptno0=[$1], name0=[$2], empid0=[$5]): rowcount = 15.0, cumulative cost = {15.0 rows, 60.0 cpu, 0.0 io}, id = 4206
      EnumerableJoin(subset=[rel#4204:Subset#52.ENUMERABLE.[]], condition=[=($1, $7)], joinType=[inner]): rowcount = 15.0, cumulative cost = {116.0 rows, 0.0 cpu, 0.0 io}, id = 4795
        EnumerableTableScan(subset=[rel#158:Subset#11.ENUMERABLE.[0]], table=[[hr, m0]]): rowcount = 1.0, cumulative cost = {0.0 rows, 1.0 cpu, 0.0 io}, id = 79
        EnumerableTableScan(subset=[rel#115:Subset#5.ENUMERABLE.[]], table=[[hr, depts]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 62
{noformat}
Its cumulative cost is {{{233.0 rows, 148.0 cpu, 0.0 io}}}. 

The new plan does not contain the {{EnumerableProject}} {{RelNode}}, so the cumulative cost
of some RelNodes in the plan becomes smaller. However, the row count increases, which raises
the non-cumulative cost, and as a result the cost of the best {{RelNode}} increases.
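
The effect can be reproduced outside Calcite with a toy model. The sketch below is not Calcite code: the class {{StaleBestCostSketch}}, the 1.5-per-row cost factor, and all numbers are made up for illustration. It also assumes that a cached best cost is only overwritten by a strictly lower value, which is how the stale value survives here.

```java
/** Toy model (not Calcite code) of a cached best cost going stale. */
final class StaleBestCostSketch {

  /** A candidate plan for the input: its output row count and its cumulative cost. */
  static final class InputPlan {
    final double rowCount;
    final double cumulativeCost;

    InputPlan(double rowCount, double cumulativeCost) {
      this.rowCount = rowCount;
      this.cumulativeCost = cumulativeCost;
    }
  }

  /** Hypothetical cost model: the parent pays 1.5 units per input row. */
  static double parentSelfCost(InputPlan input) {
    return 1.5 * input.rowCount;
  }

  /** Cumulative cost of the parent = its own cost + cumulative cost of its input. */
  static double parentCumulativeCost(InputPlan input) {
    return parentSelfCost(input) + input.cumulativeCost;
  }

  /** Returns {cached parent cost, recomputed parent cost}. */
  static double[] run() {
    // Initial best input: few rows, but expensive to produce.
    InputPlan oldBest = new InputPlan(1.0, 10.0);
    double cached = parentCumulativeCost(oldBest); // 1.5 + 10.0

    // A cheaper input is found (4.0 < 10.0), so it becomes the input's best,
    // but it produces more rows.
    InputPlan newBest = new InputPlan(15.0, 4.0);
    double recomputed = parentCumulativeCost(newBest); // 22.5 + 4.0

    // Assumed propagation rule: only a strictly lower cost replaces the cached one.
    if (recomputed < cached) {
      cached = recomputed;
    }
    return new double[] {cached, recomputed};
  }

  public static void main(String[] args) {
    double[] costs = run();
    // Prints: cached=11.5 recomputed=26.5
    System.out.println("cached=" + costs[0] + " recomputed=" + costs[1]);
  }
}
```

In this model the input improved from 10.0 to 4.0, yet the parent's real cost rose from 11.5 to 26.5 while the cached value stayed at 11.5. That is the same kind of mismatch the check in {{VolcanoPlanner.validate()}} reports.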



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
