calcite-dev mailing list archives

From Enrico Olivelli <eolive...@gmail.com>
Subject Problems on HerdDB with 1.23 was [VOTE] Release apache-calcite-1.23.0 (release candidate 0)
Date Tue, 12 May 2020 09:10:31 GMT
Haisheng,
I am sorry, I have a couple of problems with HerdDB.

1) JOIN on unsorted columns in the presence of a WHERE over other columns
This is my case:

CREATE TABLE tblspace1.table1 (k1 string primary key,n1 int,s1 string)
CREATE TABLE tblspace1.table3 (k1 string primary key,n3 int,s3 string)
SELECT t1.k1 as first, t2.k1 as second
FROM tblspace1.table1 t1
INNER JOIN tblspace1.table3 t2 ON t1.k1=t2.k1
WHERE t1.n1 + 1 = t2.n3

In this case, no column of table1 or table3 is physically sorted (no
column has a collation).

I get this planner error:
java.lang.AssertionError: cannot merge join: left input is not sorted on left keys
    at org.apache.calcite.rel.metadata.RelMdCollation.mergeJoin(RelMdCollation.java:457)
    at org.apache.calcite.rel.metadata.RelMdCollation.collations(RelMdCollation.java:153)
    at GeneratedMetadataHandler_Collation.collations_$(Unknown Source)
    at GeneratedMetadataHandler_Collation.collations(Unknown Source)
    at org.apache.calcite.rel.metadata.RelMetadataQuery.collations(RelMetadataQuery.java:539)
    at org.apache.calcite.rel.metadata.RelMdCollation.project(RelMdCollation.java:273)
    at org.apache.calcite.rel.logical.LogicalProject.lambda$create$0(LogicalProject.java:122)
    at org.apache.calcite.plan.RelTraitSet.replaceIfs(RelTraitSet.java:242)
    at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:121)
    at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:111)
    at org.apache.calcite.rel.core.RelFactories$ProjectFactoryImpl.createProject(RelFactories.java:172)
    at org.apache.calcite.tools.RelBuilder.project_(RelBuilder.java:1464)
    at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1258)
    at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1230)
    at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1219)
    at org.apache.calcite.plan.RelOptUtil.pushDownJoinConditions(RelOptUtil.java:3620)
    at org.apache.calcite.rel.rules.JoinPushExpressionsRule.onMatch(JoinPushExpressionsRule.java:59)
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:221)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:519)
    at herddb.sql.CalcitePlanner.runPlanner(CalcitePlanner.java:535)
    at herddb.sql.CalcitePlanner.translate(CalcitePlanner.java:292)

*If I remove the "WHERE" clause then no error is reported.*
We have many other test cases about JOINs and this is the only one
that fails.

This is the failing test case on HerdDB:
https://github.com/diennea/herddb/blob/vote-calcite-123/herddb-core/src/test/java/herddb/core/SimpleJoinTest.java#L522

We are using the default set of rules: Programs.ofRules(Programs.RULE_SET)

I will try to create a reproducer in the Calcite core test suite, in order
to understand whether it is a bug in HerdDB or in Calcite,
but I am reporting the problem as early as possible.
We wanted to create a daily job that tests HerdDB against current Calcite
master, but unfortunately we have not yet found the time to do it.

2) The data type of sum(N) changed from BIGINT to INTEGER

I also noticed that the type of sum(N), where N is an INTEGER column, is
now sometimes reported by Calcite as INTEGER and sometimes as BIGINT.
In 1.22 it was always reported as BIGINT.
So we have another failing test.

SELECT sum(n1), count(*) as cc, k1
FROM tblspace1.tsql
GROUP by k1
ORDER BY sum(n1)

Here sum(n1) is now reported as an INTEGER; previously it was a BIGINT. I
would prefer to see it as a BIGINT in order to prevent overflows.
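
To illustrate the overflow concern, here is a minimal plain-Java sketch. It
does not use Calcite, and the class and method names are hypothetical. It
compares a 32-bit accumulator with a 64-bit one, which is roughly the
difference between an INTEGER and a BIGINT result for sum(n1):

```java
// Hypothetical illustration only: plain Java, no Calcite APIs involved.
final class SumOverflowDemo {

    // Accumulates in 32 bits, like a SUM whose result type is INTEGER.
    static int sumAsInt(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v; // silently wraps around on overflow
        }
        return total;
    }

    // Accumulates in 64 bits, like a SUM whose result type is BIGINT.
    static long sumAsLong(int[] values) {
        long total = 0L;
        for (int v : values) {
            total += v; // v is widened to long before the addition
        }
        return total;
    }

    public static void main(String[] args) {
        int[] n1 = {Integer.MAX_VALUE, 1};
        System.out.println("32-bit sum: " + sumAsInt(n1));
        System.out.println("64-bit sum: " + sumAsLong(n1));
    }
}
```

With these two sample values the 32-bit sum wraps around to a negative
number, while the 64-bit sum keeps the correct value.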

Here are the plans:
INFO: Query: SELECT sum(n1), count(*) as cc, k1  FROM tblspace1.tsql GROUP by k1 ORDER BY sum(n1) -- Logical Plan
LogicalSort(sort0=[$0], dir0=[ASC]): rowcount = 2.0, cumulative cost = {10.525000095367432 rows, 37.0 cpu, 0.0 io}, id = 1038
  LogicalProject(EXPR$0=[$1], CC=[$2], K1=[$0]): rowcount = 2.0, cumulative cost = {8.525000095367432 rows, 13.0 cpu, 0.0 io}, id = 1037
    LogicalAggregate(group=[{0}], EXPR$0=[SUM($1)], CC=[COUNT()]): rowcount = 2.0, cumulative cost = {6.525000095367432 rows, 7.0 cpu, 0.0 io}, id = 1035
      LogicalProject(K1=[$0], n1=[$1]): rowcount = 2.0, cumulative cost = {4.0 rows, 7.0 cpu, 0.0 io}, id = 1034
        LogicalTableScan(table=[[tblspace1, tsql]]): rowcount = 2.0, cumulative cost = {2.0 rows, 3.0 cpu, 0.0 io}, id = 1032

May 12, 2020 11:07:37 AM herddb.sql.CalcitePlanner runPlanner
INFO: Query: SELECT sum(n1), count(*) as cc, k1  FROM tblspace1.tsql GROUP by k1 ORDER BY sum(n1) -- Best  Plan
EnumerableSort(sort0=[$0], dir0=[ASC]): rowcount = 2.0, cumulative cost = {5.0 rows, 31.0 cpu, 0.0 io}, id = 1245
  EnumerableProject(EXPR$0=[$1], CC=[1:BIGINT], K1=[$0]): rowcount = 2.0, cumulative cost = {3.0 rows, 7.0 cpu, 0.0 io}, id = 1244
    EnumerableInterpreter: rowcount = 2.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 1243
      BindableTableScan(table=[[tblspace1, tsql]], projects=[[0, 1]]): rowcount = 2.0, cumulative cost = {0.016 rows, 0.024 cpu, 0.0 io}, id = 1055


Within the same test case, with the same tables, the result of this query
has not changed:
SELECT sum(n1) as ss, min(n1) as mi, max(n1) as ma FROM tblspace1.tsql
INFO: Query: SELECT sum(n1) as ss, min(n1) as mi, max(n1) as ma FROM tblspace1.tsql -- Logical Plan
LogicalAggregate(group=[{}], SS=[SUM($0)], MI=[MIN($0)], MA=[MAX($0)]): rowcount = 1.0, cumulative cost = {5.387500047683716 rows, 5.0 cpu, 0.0 io}, id = 1253
  LogicalProject(n1=[$1]): rowcount = 2.0, cumulative cost = {4.0 rows, 5.0 cpu, 0.0 io}, id = 1252
    LogicalTableScan(table=[[tblspace1, tsql]]): rowcount = 2.0, cumulative cost = {2.0 rows, 3.0 cpu, 0.0 io}, id = 1250

May 12, 2020 11:08:48 AM herddb.sql.CalcitePlanner runPlanner
INFO: Query: SELECT sum(n1) as ss, min(n1) as mi, max(n1) as ma FROM tblspace1.tsql -- Best  Plan
EnumerableAggregate(group=[{}], SS=[SUM($0)], MI=[MIN($0)], MA=[MAX($0)]): rowcount = 1.0, cumulative cost = {2.387500047683716 rows, 1.0 cpu, 0.0 io}, id = 1295
  EnumerableInterpreter: rowcount = 2.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 1294
    BindableTableScan(table=[[tblspace1, tsql]], projects=[[1]]): rowcount = 2.0, cumulative cost = {0.012 rows, 0.018000000000000002 cpu, 0.0 io}, id = 1265

This is the test on HerdDB:
https://github.com/diennea/herddb/blob/vote-calcite-123/herddb-core/src/test/java/herddb/sql/SimplerPlannerTest.java#L237

I hope that helps
Enrico


On Tue, 12 May 2020 at 07:59, Haisheng Yuan <hyuan@apache.org> wrote:

> Hi all,
>
> I have created a build for Apache Calcite 1.23.0, release
> candidate 0.
>
> Thanks to everyone who has contributed to this release.
>
> You can read the release notes here:
>
> https://github.com/apache/calcite/blob/calcite-1.23.0-rc0/site/_docs/history.md
>
> The commit to be voted upon:
>
> https://gitbox.apache.org/repos/asf?p=calcite.git;a=commit;h=edc37c0a21344a48b15877788e082c8acdc7b030
>
> Its hash is edc37c0a21344a48b15877788e082c8acdc7b030
>
> Tag:
> https://github.com/apache/calcite/tree/calcite-1.23.0-rc0
>
> The artifacts to be voted on are located here:
> https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.23.0-rc0
> (revision 39385)
>
> The hashes of the artifacts are as follows:
>
> 7482b0bb76e672a15bbe846f2dbdc125bd0f3d8a32abf0ea9159b5db0ab2a2d1182e19b408098ecd68d7cc9ff5d7812ea0b33e4aeac818d191b695d437fa1a94
> *apache-calcite-1.23.0-src.tar.gz
>
> A staged Maven repository is available for review at:
>
> https://repository.apache.org/content/repositories/orgapachecalcite-1088/org/apache/calcite/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/hyuan.asc
> https://dist.apache.org/repos/dist/release/calcite/KEYS
>
> N.B.
> To create the jars and test Apache Calcite: "./gradlew build".
>
> If you do not have a Java environment available, you can run the tests
> using docker. To do so, install docker and docker-compose, then run
> "docker-compose run test" from the root of the directory.
>
> Please vote on releasing this package as Apache Calcite 1.23.0.
>
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Calcite 1.23.0
> [ ]  0 I don't feel strongly about it, but I'm okay with the release
> [ ] -1 Do not release this package because...
>
>
> Here is my vote:
>
> +1 (binding)
>
> Thanks,
> Haisheng Yuan
>
