calcite-dev mailing list archives

From Jesus Camacho Rodriguez <jcamachorodrig...@hortonworks.com>
Subject Re: HepPlanner: Optimal Plan Generation
Date Thu, 18 Feb 2016 18:21:03 GMT
Hi Victor, and welcome,

I checked the code you attached and it seems you did not add any rules to the HepPlanner.
You should manually add the rules that you want to trigger, as Milinda pointed out, for instance:
hepPlanner.addRule(ReduceExpressionsRule.CALC_INSTANCE);
hepPlanner.addRule(ProjectToWindowRule.PROJECT);
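
If the filter push-down below the join is what you are missing, a rough, untested sketch along
these lines may help; the choice of FilterJoinRule instances is my guess at what your plan needs,
and convertedNode stands for the RelNode produced by your converter:

// Rough, untested sketch: build a HepProgram that registers the rules explicitly.
// (HepPlanner/HepProgramBuilder live in org.apache.calcite.plan.hep, the rules in
//  org.apache.calcite.rel.rules, RelNode in org.apache.calcite.rel.)
// FilterJoinRule.FILTER_ON_JOIN / FilterJoinRule.JOIN are my guess at what covers
// the missing filter push-down; convertedNode is assumed to be your converted RelNode.
HepProgramBuilder builder = new HepProgramBuilder();
builder.addRuleInstance(ReduceExpressionsRule.CALC_INSTANCE);
builder.addRuleInstance(ProjectToWindowRule.PROJECT);
builder.addRuleInstance(FilterJoinRule.FILTER_ON_JOIN); // push filter conditions below the join
builder.addRuleInstance(FilterJoinRule.JOIN);           // push conditions sitting on the join itself

HepPlanner hepPlanner = new HepPlanner(builder.build());
hepPlanner.setRoot(convertedNode);
RelNode optimized = hepPlanner.findBestExp();

addRuleInstance puts each rule directly into the program, so the rules fire without a
separate hepPlanner.addRule registration.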


Thanks,
Jesús






On 2/18/16, 6:52 PM, "Milinda Pathirage" <mpathira@umail.iu.edu> wrote:

>Hi Victor,
>
>I don't know much about how planning works internally. But it is possible
>that some rules were not applied. I think you can verify this by creating a
>HepPlanner with the relevant rules and sending your plan through it. Note
>that I'm not 100% sure whether this will work or not.
>
>HepProgramBuilder hepProgramBuilder = new HepProgramBuilder();
>hepProgramBuilder.addRuleClass(ReduceExpressionsRule.class);
>hepProgramBuilder.addRuleClass(ProjectToWindowRule.class);
>HepPlanner  hepPlanner = new HepPlanner(hepProgramBuilder.build());
>hepPlanner.addRule(ReduceExpressionsRule.CALC_INSTANCE);
>hepPlanner.addRule(ProjectToWindowRule.PROJECT);
>
>hepPlanner.setRoot(convertedNode);
>
>RelNode rel = hepPlanner.findBestExp();
>
>You should customize the above to add the rules you need for your scenario.
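>
>To check whether anything actually fired, a quick (untested) way is to dump
>the plan before and after the pass with RelOptUtil.toString
>(org.apache.calcite.plan.RelOptUtil), reusing the variables from the snippet above:
>
>System.out.println("Before:\n" + RelOptUtil.toString(convertedNode));
>System.out.println("After:\n" + RelOptUtil.toString(rel));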
>
>Thanks
>Milinda
>
>On Thu, Feb 18, 2016 at 2:00 AM, Victor Giannakouris - Salalidis <victorasgs@gmail.com> wrote:
>
>> Hello. I am trying to implement a planner in order to generate optimal
>> logical query plans using some statistics I provide to the schema.
>> Currently, the only available statistic is the number of rows of each
>> table.
>>
>> I am using HepPlanner. My actual problem is that when *findBestExp()* is
>> called, the resulting plan is not optimized. That is, the query is just
>> parsed, the join order stays the same as the one I provide in the input
>> query, and no filter push-downs are applied.
>>
>> For example, for the query
>>
>> "SELECT * FROM ftable f, products p WHERE f.id = p.pid AND p.pid = 2"
>>
>> the resulting plan is:
>>
>> 12:LogicalProject(id=[$0], desc=[$1], price=[$2], loc=[$3], pid=[$4], pdesc=[$5]): rowcount = 225000.0, cumulative cost = 1.05002E7
>>   10:LogicalFilter(condition=[AND(=($0, $4), =($4, 2))]): rowcount = 225000.0, cumulative cost = 1.02752E7
>>     8:LogicalJoin(condition=[true], joinType=[inner]): rowcount = 1.0E7, cumulative cost = 1.00502E7
>>       0:EnumerableTableScan(table=[[fTable]]): rowcount = 50000.0, cumulative cost = 50000.0
>>       1:EnumerableTableScan(table=[[products]]): rowcount = 200.0, cumulative cost = 200.0
>>
>> I implemented this using Hive's TestCBORuleFiredOnlyOnce.java
>> <https://github.com/apache/hive/blob/48b201ee163252b2127ce04fbf660df70312888a/ql/src/test/org/apache/hadoop/hive/ql/optimizer/calcite/TestCBORuleFiredOnlyOnce.java>
>> and PlannerImpl.java
>> <https://github.com/apache/calcite/blob/5323d8d48baa2d7bc8dea8b03bc0bda93563e0f9/core/src/main/java/org/apache/calcite/prepare/PlannerImpl.java>
>> as examples, and there are some classes and overridden methods which I
>> currently use as “black boxes”. Here is a link to the code of my basic
>> class: http://pastebin.com/HysfNa8S.
>>
>> --
>> Victor Giannakouris - Salalidis
>>
>> LinkedIn:
>> http://gr.linkedin.com/pub/victor-giannakouris-salalidis/69/585/b23/
>> Personal Page: http://gsvic.github.io
>>
>
>
>
>-- 
>Milinda Pathirage
>
>PhD Student | Research Assistant
>School of Informatics and Computing | Data to Insight Center
>Indiana University
>
>twitter: milindalakmal
>skype: milinda.pathirage
>blog: http://milinda.pathirage.org