calcite-dev mailing list archives

From Anton Kedin <ke...@google.com.INVALID>
Subject Re: Array element access and nested rows
Date Thu, 22 Mar 2018 16:46:40 GMT
I think we're talking about the same thing. In my case the index is also
not adjusted for the flattened row. I also see the wrong field ordinal,
which causes the same kind of mismatch, but for me it surfaces higher up,
in userland, because of how we wrap Calcite code (I'm not working with the
Calcite source at the moment).

I will build Calcite with your fix and report whether it resolves my
issue. Thanks for the help!

Regards,
Anton
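
For readers following the thread: the ordinal adjustment discussed in the
quoted messages below can be sketched in plain Java. This is a minimal
illustration, not Calcite's implementation. The FlattenOrdinals class, the
Field type and the flattenedOrdinal method are hypothetical names, and the
model assumes (as the plans in this thread show) that record columns expand
into their fields while array columns stay as a single column.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative model of row flattening; hypothetical, not Calcite's API. */
final class FlattenOrdinals {

    /** A column: a leaf (scalar or array) or a record with nested fields. */
    static final class Field {
        final String name;
        final List<Field> children; // empty for leaves; arrays are kept whole

        Field(String name, Field... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }

        /** Number of columns this field occupies once records are expanded. */
        int flattenedWidth() {
            if (children.isEmpty()) {
                return 1;
            }
            int width = 0;
            for (Field child : children) {
                width += child.flattenedWidth();
            }
            return width;
        }
    }

    /** Maps an ordinal in the original row to its first flattened ordinal. */
    static int flattenedOrdinal(List<Field> row, int originalOrdinal) {
        int offset = 0;
        for (int i = 0; i < originalOrdinal; i++) {
            offset += row.get(i).flattenedWidth();
        }
        return offset;
    }

    public static void main(String[] args) {
        // DEPT_NESTED as declared in the quoted test setup.
        List<Field> deptNested = Arrays.asList(
            new Field("DEPTNO"),
            new Field("NAME"),
            new Field("SKILLRECORD", new Field("TYPE"), new Field("DESC")),
            new Field("EMPLOYEES"));
        // EMPLOYEES is $3 before flattening and $4 after.
        System.out.println(flattenedOrdinal(deptNested, 3)); // prints 4

        // Row type from the original question.
        List<Field> order = Arrays.asList(
            new Field("orderId"),
            new Field("person", new Field("name"), new Field("personId")),
            new Field("items"));
        // items is $2 before flattening and $3 after.
        System.out.println(flattenedOrdinal(order, 2)); // prints 3
    }
}
```

In both cases the expected input ref is the original ordinal plus the extra
columns introduced by every record that precedes it, which is the adjustment
the failing plans are missing.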



On Thu, Mar 22, 2018 at 1:18 AM Shuyi Chen <suez1224@gmail.com> wrote:

> Also, you can try to patch in this PR to see if that fixes your issue,
> https://github.com/apache/calcite/pull/651.
>
> On Thu, Mar 22, 2018 at 12:14 AM, Shuyi Chen <suez1224@gmail.com> wrote:
>
> > I think the following is what happened:
> >
> > Calcite tries to remove all structured types from the plan below, so that
> > optimizer and codegen rules never have to deal with structured types.
> >
> > LogicalProject(EXPR$0=[ITEM($3, 1)])
> >   LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
> >
> > First, it flattens the LogicalTableScan and generates the following:
> >
> > LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC], EMPLOYEES=[$3])
> >   LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
> >
> > Then it tries to flatten "LogicalProject(EXPR$0=[ITEM($3, 1)])" and
> > generates the following:
> >
> > LogicalProject(EXPR$0$0=[ITEM($3, 1).EMPNO], EXPR$0$1=[ITEM($3, 1).ENAME], EXPR$0$2=[ITEM($3, 1).SKILLS])
> >
> > However, when it combines the two flattening results, it does not correctly
> > adjust the ordinal post-flattening, which should now be $4, not $3. This
> > causes the exception, which is a type mismatch.
> >
> > I think I've already developed a fix for this. I will create a PR to
> > address both issues.
> >
> > @Anton, although my test error and your issue look similar, I still can't
> > reproduce your case (mine throws an error). Can you create a test for it?
> > Thanks a lot.
> >
> >
> >
> >
> >
> > On Wed, Mar 21, 2018 at 9:31 PM, Anton Kedin <kedin@google.com.invalid>
> > wrote:
> >
> >> Shuyi,
> >>
> >> Thank you for looking into this. Could the error in your case be caused
> >> by a similar problem? E.g. SKILLRECORD gets flattened, then when you try
> >> to select employees[1] you get the SKILLRECORD.DESC field instead of the
> >> actual employees[1], because the input ref index is not adjusted for the
> >> flattened SKILLRECORD?
> >>
> >> Thank you,
> >> Anton
> >>
> >>
> >> On Wed, Mar 21, 2018 at 8:53 PM Shuyi Chen <suez1224@gmail.com> wrote:
> >>
> >> > Actually, the cause of my previous findings is this: in the first case,
> >> > SqlToRelConverterTest introduces another LogicalProject (RelRoot.project)
> >> > after applying the SqlToRelConverter, to remove fields that are not
> >> > needed. But this function does not work with record types and flattened
> >> > fields. It simply projects the first several fields from the input
> >> > index-wise and does not take the flattening behavior into account. The
> >> > second case does not trigger the extra project because it is trivial.
> >> >
> >> > For your case, I tried below:
> >> >
> >> > MockTable deptNestedTable =
> >> >     MockTable.create(this, salesSchema, "DEPT_NESTED", false, 4);
> >> > deptNestedTable.addColumn("DEPTNO", f.intType, true);
> >> > deptNestedTable.addColumn("NAME", f.varchar10Type);
> >> > deptNestedTable.addColumn("SKILLRECORD", f.skillRecordType);
> >> > deptNestedTable.addColumn("EMPLOYEES", f.empListType);
> >> > registerTable(deptNestedTable);
> >> >
> >> > Run the following test:
> >> >
> >> > @Test public void testArrayOfRecord() {
> >> >   sql("select employees[1] from dept_nested").ok();
> >> > }
> >> >
> >> > When I run it, I actually get the following error:
> >> >
> >> > java.lang.AssertionError: type mismatch:
> >> > ref:
> >> > RecordType(INTEGER NOT NULL EMPNO, VARCHAR(10) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL ENAME, RecordType(VARCHAR(10) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL TYPE, VARCHAR(20) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL DESC) NOT NULL ARRAY NOT NULL SKILLS) NOT NULL ARRAY NOT NULL
> >> > input:
> >> > VARCHAR(20) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL
> >> >
> >> > at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
> >> > at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1838)
> >> > at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:125)
> >> > at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:57)
> >> > at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
> >> > at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:140)
> >> > at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:57)
> >> > at org.apache.calcite.rex.RexCall.accept(RexCall.java:107)
> >> > at org.apache.calcite.rex.RexVisitorImpl.visitFieldAccess(RexVisitorImpl.java:98)
> >> > at org.apache.calcite.rex.RexChecker.visitFieldAccess(RexChecker.java:149)
> >> > at org.apache.calcite.rex.RexChecker.visitFieldAccess(RexChecker.java:57)
> >> > at org.apache.calcite.rex.RexFieldAccess.accept(RexFieldAccess.java:81)
> >> > at org.apache.calcite.rel.core.Project.isValid(Project.java:187)
> >> > at org.apache.calcite.rel.core.Project.<init>(Project.java:84)
> >> > at org.apache.calcite.rel.logical.LogicalProject.<init>(LogicalProject.java:65)
> >> > at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:120)
> >> > at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:103)
> >> > at org.apache.calcite.rel.core.RelFactories$ProjectFactoryImpl.createProject(RelFactories.java:127)
> >> > at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1064)
> >> > at org.apache.calcite.plan.RelOptUtil.createProject(RelOptUtil.java:2956)
> >> > at org.apache.calcite.plan.RelOptUtil.createProject(RelOptUtil.java:2873)
> >> > at org.apache.calcite.sql2rel.RelStructuredTypeFlattener.rewriteRel(RelStructuredTypeFlattener.java:477)
> >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> > at java.lang.reflect.Method.invoke(Method.java:498)
> >> > at org.apache.calcite.util.ReflectUtil.invokeVisitorInternal(ReflectUtil.java:257)
> >> > at org.apache.calcite.util.ReflectUtil.invokeVisitor(ReflectUtil.java:214)
> >> > at org.apache.calcite.util.ReflectUtil$1.invokeVisitor(ReflectUtil.java:464)
> >> > at org.apache.calcite.sql2rel.RelStructuredTypeFlattener$RewriteRelVisitor.visit(RelStructuredTypeFlattener.java:721)
> >> > at org.apache.calcite.sql2rel.RelStructuredTypeFlattener.rewrite(RelStructuredTypeFlattener.java:177)
> >> > at org.apache.calcite.sql2rel.SqlToRelConverter.flattenTypes(SqlToRelConverter.java:462)
> >> > at org.apache.calcite.test.SqlToRelTestBase$TesterImpl.convertSqlToRel(SqlToRelTestBase.java:585)
> >> > at org.apache.calcite.test.SqlToRelTestBase$TesterImpl.assertConvertsTo(SqlToRelTestBase.java:690)
> >> > at org.apache.calcite.test.SqlToRelConverterTest$Sql.convertsTo(SqlToRelConverterTest.java:2784)
> >> > at org.apache.calcite.test.SqlToRelConverterTest$Sql.ok(SqlToRelConverterTest.java:2776)
> >> > at org.apache.calcite.test.SqlToRelConverterTest.testArrayOfRecord(SqlToRelConverterTest.java:1059)
> >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> > at java.lang.reflect.Method.invoke(Method.java:498)
> >> > at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> >> > at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >> > at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> >> > at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >> > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> >> > at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> >> > at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> >> > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> >> > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> >> > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> >> > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> >> > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> >> > at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> >> > at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> >> > at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:117)
> >> > at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:42)
> >> > at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:262)
> >> > at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:84)
> >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> > at java.lang.reflect.Method.invoke(Method.java:498)
> >> > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
> >> >
> >> > Shuyi
> >> >
> >> >
> >> > On Wed, Mar 21, 2018 at 6:09 PM, Shuyi Chen <suez1224@gmail.com> wrote:
> >> >
> >> > > Thanks a lot, Anton. This seems to be a bug in Calcite. When a
> >> > > statement involves record types, SQL validation seems to work, but
> >> > > the rel plan generated might be wrong. I can also reproduce your case:
> >> > >
> >> > > MockTable deptNestedTable =
> >> > >     MockTable.create(this, salesSchema, "DEPT_NESTED", false, 4);
> >> > > deptNestedTable.addColumn("DEPTNO", f.intType, true);
> >> > > deptNestedTable.addColumn("NAME", f.varchar10Type);
> >> > > deptNestedTable.addColumn("SKILLRECORD", f.skillRecordType);
> >> > > deptNestedTable.addColumn("EMPLOYEES", f.empListType);
> >> > > registerTable(deptNestedTable);
> >> > >
> >> > > Run the following test:
> >> > >
> >> > > @Test public void testArrayOfRecord() {
> >> > >   sql("select skillrecord, employees from dept_nested").ok();
> >> > > }
> >> > >
> >> > > yields:
> >> > > LogicalProject(SKILLRECORD=[$0], EMPLOYEES=[$1])
> >> > >   LogicalProject(SKILLRECORD=[$2], SKILLRECORD1=[$3], EMPLOYEES=[$4])
> >> > >     LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC], EMPLOYEES=[$3])
> >> > >       LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
> >> > >
> >> > > Sometimes, it works:
> >> > >
> >> > > @Test public void testArrayOfRecord() {
> >> > >   sql("select name, employees from dept_nested").ok();
> >> > > }
> >> > >
> >> > > yields:
> >> > >
> >> > > LogicalProject(NAME=[$1], EMPLOYEES=[$4])
> >> > >   LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC], EMPLOYEES=[$3])
> >> > >     LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
> >> > >
> >> > > I can take a deeper look.
> >> > >
> >> > > Shuyi
> >> > >
> >> > > On Wed, Mar 21, 2018 at 11:06 AM, Anton Kedin <kedin@google.com.invalid> wrote:
> >> > >
> >> > >> Hi,
> >> > >>
> >> > >> I have an issue I am not sure how to handle; I would appreciate any
> >> > >> pointers.
> >> > >>
> >> > >> I have a table with row type:
> >> > >> RecordType(
> >> > >>     INTEGER orderId,
> >> > >>     RecordType(VARCHAR name, INTEGER personId)
> >> > >>         person,
> >> > >>     RecordType(VARCHAR sku, INTEGER price, VARCHAR currency, VARCHAR ARRAY tags)
> >> > >>         ARRAY items
> >> > >> )
> >> > >>
> >> > >> With this row type I am trying to model a JSON object which looks like
> >> > >> this:
> >> > >> { "orderId" : 1,
> >> > >>   "person" : { "name" : "john", "personId" : 12 },
> >> > >>   "items": [
> >> > >>     { "sku" : "aaa01", "price" : 12, "currency" : "USD", "tags" : ["blue", "book"] }
> >> > >>   ]}
> >> > >>
> >> > >> When selecting the whole items array I get the following plan:
> >> > >> SELECT items FROM PCOLLECTION
> >> > >>
> >> > >> LogicalProject(items=[$3])
> >> > >>   LogicalProject(orderId=[$0], name=[$1.name], personId=[$1.personId], items=[$2])
> >> > >>     LogicalTableScan(table=[[PCOLLECTION]])
> >> > >>
> >> > >> This looks correct, and it works. One thing to note here is that
> >> > >> Calcite flattens the person row and makes the input ref for the items
> >> > >> field $3, as expected.
> >> > >>
> >> > >> But when I want to get a specific element from that array I get the
> >> > >> following:
> >> > >> SELECT items[0] FROM PCOLLECTION
> >> > >>
> >> > >> LogicalProject(EXPR$0$0=[ITEM($2, 0).sku], EXPR$0$1=[ITEM($2, 0).price], EXPR$0$2=[ITEM($2, 0).currency], EXPR$0$3=[ITEM($2, 0).tags])
> >> > >>   LogicalProject(orderId=[$0], name=[$1.name], personId=[$1.personId], items=[$2])
> >> > >>     LogicalTableScan(table=[[PCOLLECTION]])
> >> > >>
> >> > >> The first project looks the same: flattened person row, items array,
> >> > >> all similar to the above. But the outer project calls ITEM($2, i). I
> >> > >> would expect it to be ITEM($3, i) instead, to adjust for the flattened
> >> > >> person row. It keeps the index as $2, which would have been the
> >> > >> correct index if the row were not flattened, but it should be $3 for
> >> > >> the flattened row, as in the previous example.
> >> > >>
> >> > >> Is there something I am missing, or is it a bug and Calcite should
> >> > >> adjust the input ref index to account for flattened rows in this case
> >> > >> as well?
> >> > >>
> >> > >> Thank you,
> >> > >> Anton
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > "So you have to trust that the dots will somehow connect in your
> >> future."
> >> > >
