hawq-dev mailing list archives

From Saravana kumar Sankaramoorthy <sarav...@axiomatics.com>
Subject GpOrca behaviour on TPC-H query 21
Date Thu, 15 Sep 2016 10:32:15 GMT

We are using HAWQ 2.0.0 in one of our product evaluations, and we are fairly new to the
technology. HAWQ uses gporca 1.627 to generate the execution plan. We analysed Query 21
of the TPC-H benchmark, downloaded from here <https://github.com/pivotalguru/demos/tree/master/TPC-H%20Benchmark/tpch_2_17_0>.
We used TPC-H scale factor 1 and ran it against a 5-node Docker cluster. In Query 21, the
table lineitem is referenced three times. We expected gporca to apply common subexpression
elimination, as described in this video <https://discuss.pivotal.io/hc/en-us/articles/212714417-How-to-optimize-common-table-expression-CTE-i-e-WITH-clause-statement-in-GPDB->,
but it did not. We manually modified the query to use a CTE and found that it executes
faster than the original. I have attached both queries and the execution plans generated
for them.
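For readers without the attachments, here is a minimal sketch of the kind of rewrite meant above. It is not the attached queries: the query text is a simplified form of the standard TPC-H Query 21, the CTE name `li` and the toy rows are made up for illustration, and SQLite (via Python's standard library) is used only as a stand-in engine to check that the two forms return the same rows. It says nothing about the plans HAWQ or gporca would choose.

```python
import sqlite3

# Simplified TPC-H Q21 body. {li} is the relation used for all three
# lineitem references (aliases l1, l2, l3).
Q21_BODY = """
SELECT s_name, COUNT(*) AS numwait
FROM supplier, {li} l1, orders, nation
WHERE s_suppkey = l1.l_suppkey
  AND o_orderkey = l1.l_orderkey
  AND o_orderstatus = 'F'
  AND l1.l_receiptdate > l1.l_commitdate
  AND EXISTS (
        SELECT 1 FROM {li} l2
        WHERE l2.l_orderkey = l1.l_orderkey
          AND l2.l_suppkey <> l1.l_suppkey)
  AND NOT EXISTS (
        SELECT 1 FROM {li} l3
        WHERE l3.l_orderkey = l1.l_orderkey
          AND l3.l_suppkey <> l1.l_suppkey
          AND l3.l_receiptdate > l3.l_commitdate)
  AND s_nationkey = n_nationkey
  AND n_name = 'SAUDI ARABIA'
GROUP BY s_name
ORDER BY numwait DESC, s_name
"""

# Original form: lineitem is named three times.
ORIGINAL_SQL = Q21_BODY.format(li="lineitem")

# Manual rewrite: lineitem is named once, in a CTE (hypothetical name "li"),
# and the three references point at the CTE instead.
CTE_SQL = (
    "WITH li AS (SELECT l_orderkey, l_suppkey, l_receiptdate, l_commitdate"
    " FROM lineitem)"
    + Q21_BODY.format(li="li")
)


def build_toy_db():
    """Create an in-memory database with a few made-up rows (not TPC-H data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE nation   (n_nationkey INTEGER, n_name TEXT);
        CREATE TABLE supplier (s_suppkey INTEGER, s_name TEXT, s_nationkey INTEGER);
        CREATE TABLE orders   (o_orderkey INTEGER, o_orderstatus TEXT);
        CREATE TABLE lineitem (l_orderkey INTEGER, l_suppkey INTEGER,
                               l_receiptdate TEXT, l_commitdate TEXT);
    """)
    conn.executemany("INSERT INTO nation VALUES (?, ?)",
                     [(1, "SAUDI ARABIA"), (2, "FRANCE")])
    conn.executemany("INSERT INTO supplier VALUES (?, ?, ?)",
                     [(10, "Supplier#10", 1), (11, "Supplier#11", 1),
                      (12, "Supplier#12", 2)])
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(100, "F"), (101, "F"), (102, "O"), (103, "F")])
    late = ("1995-02-10", "1995-02-01")     # received after the commit date
    on_time = ("1995-01-20", "1995-02-01")  # received before the commit date
    conn.executemany("INSERT INTO lineitem VALUES (?, ?, ?, ?)", [
        (100, 10) + late, (100, 11) + on_time,  # only supplier 10 is late
        (101, 10) + late, (101, 11) + late,     # both suppliers are late
        (102, 11) + late, (102, 12) + on_time,  # order status is not 'F'
        (103, 11) + late, (103, 12) + on_time,  # only supplier 11 is late
    ])
    return conn


def run_query(conn, sql):
    """Run one query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_toy_db()
    original_rows = run_query(conn, ORIGINAL_SQL)
    cte_rows = run_query(conn, CTE_SQL)
    print(original_rows)
    print("same rows:", original_rows == cte_rows)
```

The two statements differ only in where lineitem is named, which is the property the question below relies on: inlining the CTE definition gives back the original query.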

Why did gporca not apply common subexpression elimination to the original query?
If the reason is that the plan using a CTE is costed higher, then expanding the CTE definition
inline in the modified query yields the original query, which has the cheaper cost. In that case
gporca should produce the original execution plan for the modified query too, but it does not.
I would like to understand why.

I would be very glad if someone could clarify why gporca behaves this way. I hope this is the
correct forum for this question; if it is not, please direct me to the right place.
Thanks in advance.

Best Regards,
Technical Lead
Axiomatics AB
