trafodion-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Suresh Subbiah <suresh.subbia...@gmail.com>
Subject Re: Bug in MERGE/UPSERT transformation
Date Wed, 09 Sep 2015 00:09:36 GMT
Hi Dave,

It is interesting that costing changes caused the second plan to be
considered and chosen.
I have been working on transforming UPSERT to MERGE recently and have often
seen both kind of plans.
The first kind (which you said works correctly) is what we get for
something like
UPSERT into t values (1,1,1) -- if t had 2 indexes

We get a variant of second plan for these types of UPSERT statememts
UPSERT into t values select * from t1 ;
UPSERT into t values (1,1,1),(2,2,2) ;
However the plans I saw for the second case differ from what you observed
in that the scan drives the rest of the query tree and not just the merge.
In other words the nested join has scan as its left child and the topmost
nested in the second plan as its right child.

I don't know what is causing these new plans, but feel it has to be through
one of the optimizer rules. We can verify that with the display tool.
Looking forward to learn from your findings.

Thanks
Suresh

PS All these UPSERT statements can be cast into equivalent MEGE statements
if necessary

On Tue, Sep 8, 2015 at 5:51 PM, Dave Birdsall <dave.birdsall@esgyn.com>
wrote:

> Hi,
>
>
>
> I found out that it wasn’t that easy. Tried the proposed fix but that
> didn’t solve the problem. And indeed the TSJFlowRule::topMatch wasn’t
> firing anyway since expr->getNoFlow() was returning TRUE. So the join
> transformation was being introduced somewhere else.
>
>
>
> After some more experimentation I learned that the table in question has
> two indexes in the test script. And if I get rid of either one, the bug
> does not occur. Which makes me think some transformation involving multiple
> indexes is the culprit.
>
>
>
> I’ll keep debugging…
>
>
>
> Dave
>
>
>
> *From:* Dave Birdsall [mailto:dave.birdsall@esgyn.com]
> *Sent:* Tuesday, September 8, 2015 2:44 PM
> *To:* 'dev@trafodion.incubator.apache.org' <
> dev@trafodion.incubator.apache.org>
> *Subject:* Bug in MERGE/UPSERT transformation
>
>
>
> Hi,
>
>
>
> I’ve been playing with costing code (specifically, adding costing code for
> HBase deletes and updates). While debugging these changes, I ran executor
> regressions. TEST015 fails, with some MERGE and UPSERT statements giving
> wrong results.
>
>
>
> In a correct run (with my changes turned off), I see a plan like this:
>
>
>
> >>explain options 'f' merge into t015t1 on a = 1 when matched then update
> set b = 01
>
> +> when not matched then insert values (1,2);
>
>
>
> LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
>
> ---- ---- ---- --------------------  --------  --------------------
> ---------
>
>
>
> 9    .    10   root                  o         x
> 4.00E+000
>
> 1    8    9    nested_join
>                4.00E+000
>
> 4    7    8    merge_union
> 4.00E+000
>
> 5    6    7    blocked_union
> 2.00E+000
>
> .    .    6    trafodion_insert                T015T1I2
> 1.00E+000
>
> .    .    5    trafodion_vsbb_delet            T015T1I2
> 1.00E+000
>
> 2    3    4    blocked_union
> 2.00E+000
>
> .    .    3    trafodion_insert                T015T1I1
> 1.00E+000
>
> .    .    2    trafodion_delete                T015T1I1
> 1.00E+000
>
> .    .    1    trafodion_merge                 T015T1
> 1.00E+000
>
>
>
> --- SQL operation complete.
>
> >>log;
>
>
>
> In a failed run (with my changes turned on), I see a plan like this:
>
>
>
> >>explain options 'f' merge into t015t1 on a=1 when matched then
>
> +> update set b = -1
>
> +> when not matched then insert values (1,2);
>
>
>
> LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
>
> ---- ---- ---- --------------------  --------  --------------------
> ---------
>
>
>
> 11   .    12   root                            x
> 4.00E+000
>
> 3    10   11   nested_join
>                                4.00E+000
>
> 6    9    10   merge_union
> 4.00E+000
>
> 7    8    9    blocked_union
> 2.00E+000
>
> .    .    8    trafodion_insert                T015T1I2
>             1.00E+000
>
> .    .    7    trafodion_vsbb_delet            T015T1I2
> 1.00E+000
>
> 4    5    6    blocked_union
> 2.00E+000
>
> .    .    5    trafodion_insert                T015T1I1
> 1.00E+000
>
> .    .    4    trafodion_delete                T015T1I1
> 1.00E+000
>
> 1    2    3    nested_join
> 1.00E+000
>
> .    .    2    trafodion_merge                 T015T1
> 1.00E+000
>
> .    .    1    trafodion_scan                  T015T1
> 1.00E+000
>
>
>
> --- SQL operation complete.
>
>
>
> In this example, the table T015T1 has two indexes, which is why the
> additional operators are present. The index maintenance part looks correct
> in both plans.
>
>
>
> The thing that looks incorrect is the transformation of a “trafodion merge”
> to the base table into a “nested join” of a scan of that table and a
> “trafodion merge” of that table. Such a transformation would be correct for
> updates and deletes (and indeed we do this for delete in order to take
> advantage of the HBase API that allows multiple deletes on one call; that
> shows up in explain as “trafodion_vsbb_dele”). But I don’t think it should
> ever be correct for a merge/upsert, since part of the action of a
> merge/upsert depends on the **absence** of particular keys. A scan + nested
> join cannot express absence, only presence.
>
>
>
> I think the fix is in TSJFlowRule::topMatch (in optimizer/Transrule.cpp),
> we should return false if expr->isMerge() is true.
>
>
>
> What do you think?
>
>
>
> Dave
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message