calcite-dev mailing list archives

From Anton Kedin <ke...@google.com.INVALID>
Subject Re: Array element access and nested rows
Date Thu, 22 Mar 2018 17:25:25 GMT
Shuyi,

The fix works for me. Thanks!

Regards,
Anton


On Thu, Mar 22, 2018 at 9:46 AM Anton Kedin <kedin@google.com> wrote:

> I think we're talking about the same thing. In my case the index is also
> not adjusted for the flattened row. I also see the wrong field ordinal,
> which causes the same kind of mismatch, but for me it happens higher up, in
> user land, because of how we wrap Calcite code (I'm not working with
> Calcite source at the moment).
>
> I will build Calcite with your fix and report whether it fixes my
> issue. Thanks for the help!
>
> Regards,
> Anton
>
>
>
> On Thu, Mar 22, 2018 at 1:18 AM Shuyi Chen <suez1224@gmail.com> wrote:
>
>> Also, you can try to patch in this PR to see if that fixes your issue,
>> https://github.com/apache/calcite/pull/651.
>>
>> On Thu, Mar 22, 2018 at 12:14 AM, Shuyi Chen <suez1224@gmail.com> wrote:
>>
>> > I think the following is what happened:
>> >
>> > Calcite tries to remove all structured types from the plan right below,
>> > so that optimizer and codegen rules never have to deal with structured types.
>> >
>> > LogicalProject(EXPR$0=[ITEM($3, 1)])
>> >   LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
>> >
>> > First, it flattens the LogicalTableScan and generates the following:
>> >
>> > LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC], EMPLOYEES=[$3])
>> >   LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
>> >
>> > Then it tries to flatten "LogicalProject(EXPR$0=[ITEM($3, 1)])" and
>> > generates the following:
>> >
>> > LogicalProject(EXPR$0$0=[ITEM($3, 1).EMPNO], EXPR$0$1=[ITEM($3, 1).ENAME], EXPR$0$2=[ITEM($3, 1).SKILLS])
>> >
>> > However, when it combines the two flattening results, it does not correctly
>> > adjust the ordinal after flattening, which should now be $4, not $3. This
>> > causes the exception, since the types no longer match.
>> >
>> > I think I've already developed a fix for this. Will create a PR to address
>> > both issues.
>> >
>> > @Anton, although my test error and your issue look similar, I still can't
>> > reproduce your case (mine throws an error). Can you create a test for it?
>> > Thanks a lot.
>> >
>> >
>> >
>> >
>> >
>> > On Wed, Mar 21, 2018 at 9:31 PM, Anton Kedin <kedin@google.com.invalid> wrote:
>> >
>> >> Shuyi,
>> >>
>> >> Thank you for looking into this. Can this error in your case be caused by a
>> >> similar problem? E.g. SKILLRECORD gets flattened, and then when you try to
>> >> select employees[1] you get the SKILLRECORD.DESC field instead of the actual
>> >> employees[1], because the input ref index is not adjusted for the flattened
>> >> SKILLRECORD?
>> >>
>> >> Thank you,
>> >> Anton
>> >>
>> >>
>> >> On Wed, Mar 21, 2018 at 8:53 PM Shuyi Chen <suez1224@gmail.com> wrote:
>> >>
>> >> > Actually, the cause of my previous findings is this: in the first case,
>> >> > SqlToRelConverterTest introduces another LogicalProject (RelRoot.project)
>> >> > after applying the SqlToRelConverter, to remove fields that are not needed.
>> >> > But this function does not work with record types and flattened fields. It
>> >> > simply projects the first several fields from the input index-wise and does
>> >> > not take the flattening behavior into account. The second case does not
>> >> > trigger the extra project because it's trivial.
>> >> >
>> >> > For your case, I tried the following:
>> >> >
>> >> > MockTable deptNestedTable =
>> >> >     MockTable.create(this, salesSchema, "DEPT_NESTED", false, 4);
>> >> > deptNestedTable.addColumn("DEPTNO", f.intType, true);
>> >> > deptNestedTable.addColumn("NAME", f.varchar10Type);
>> >> > deptNestedTable.addColumn("SKILLRECORD", f.skillRecordType);
>> >> > deptNestedTable.addColumn("EMPLOYEES", f.empListType);
>> >> > registerTable(deptNestedTable);
>> >> >
>> >> > Run the following test:
>> >> >
>> >> > @Test public void testArrayOfRecord() {
>> >> >   sql("select employees[1] from dept_nested").ok();
>> >> > }
>> >> >
>> >> > When I run it, I actually get the following error:
>> >> >
>> >> > java.lang.AssertionError: type mismatch:
>> >> > ref:
>> >> > RecordType(INTEGER NOT NULL EMPNO,
>> >> >   VARCHAR(10) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL ENAME,
>> >> >   RecordType(VARCHAR(10) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL TYPE,
>> >> >     VARCHAR(20) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL DESC) NOT NULL ARRAY NOT NULL SKILLS)
>> >> >   NOT NULL ARRAY NOT NULL
>> >> > input:
>> >> > VARCHAR(20) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL
>> >> >
>> >> > at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>> >> > at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1838)
>> >> > at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:125)
>> >> > at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:57)
>> >> > at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>> >> > at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:140)
>> >> > at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:57)
>> >> > at org.apache.calcite.rex.RexCall.accept(RexCall.java:107)
>> >> > at org.apache.calcite.rex.RexVisitorImpl.visitFieldAccess(RexVisitorImpl.java:98)
>> >> > at org.apache.calcite.rex.RexChecker.visitFieldAccess(RexChecker.java:149)
>> >> > at org.apache.calcite.rex.RexChecker.visitFieldAccess(RexChecker.java:57)
>> >> > at org.apache.calcite.rex.RexFieldAccess.accept(RexFieldAccess.java:81)
>> >> > at org.apache.calcite.rel.core.Project.isValid(Project.java:187)
>> >> > at org.apache.calcite.rel.core.Project.<init>(Project.java:84)
>> >> > at org.apache.calcite.rel.logical.LogicalProject.<init>(LogicalProject.java:65)
>> >> > at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:120)
>> >> > at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:103)
>> >> > at org.apache.calcite.rel.core.RelFactories$ProjectFactoryImpl.createProject(RelFactories.java:127)
>> >> > at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1064)
>> >> > at org.apache.calcite.plan.RelOptUtil.createProject(RelOptUtil.java:2956)
>> >> > at org.apache.calcite.plan.RelOptUtil.createProject(RelOptUtil.java:2873)
>> >> > at org.apache.calcite.sql2rel.RelStructuredTypeFlattener.rewriteRel(RelStructuredTypeFlattener.java:477)
>> >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> >> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >> > at java.lang.reflect.Method.invoke(Method.java:498)
>> >> > at org.apache.calcite.util.ReflectUtil.invokeVisitorInternal(ReflectUtil.java:257)
>> >> > at org.apache.calcite.util.ReflectUtil.invokeVisitor(ReflectUtil.java:214)
>> >> > at org.apache.calcite.util.ReflectUtil$1.invokeVisitor(ReflectUtil.java:464)
>> >> > at org.apache.calcite.sql2rel.RelStructuredTypeFlattener$RewriteRelVisitor.visit(RelStructuredTypeFlattener.java:721)
>> >> > at org.apache.calcite.sql2rel.RelStructuredTypeFlattener.rewrite(RelStructuredTypeFlattener.java:177)
>> >> > at org.apache.calcite.sql2rel.SqlToRelConverter.flattenTypes(SqlToRelConverter.java:462)
>> >> > at org.apache.calcite.test.SqlToRelTestBase$TesterImpl.convertSqlToRel(SqlToRelTestBase.java:585)
>> >> > at org.apache.calcite.test.SqlToRelTestBase$TesterImpl.assertConvertsTo(SqlToRelTestBase.java:690)
>> >> > at org.apache.calcite.test.SqlToRelConverterTest$Sql.convertsTo(SqlToRelConverterTest.java:2784)
>> >> > at org.apache.calcite.test.SqlToRelConverterTest$Sql.ok(SqlToRelConverterTest.java:2776)
>> >> > at org.apache.calcite.test.SqlToRelConverterTest.testArrayOfRecord(SqlToRelConverterTest.java:1059)
>> >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> >> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >> > at java.lang.reflect.Method.invoke(Method.java:498)
>> >> > at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>> >> > at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> >> > at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>> >> > at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> >> > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>> >> > at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>> >> > at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>> >> > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>> >> > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>> >> > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>> >> > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>> >> > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>> >> > at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>> >> > at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>> >> > at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:117)
>> >> > at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:42)
>> >> > at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:262)
>> >> > at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:84)
>> >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> >> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >> > at java.lang.reflect.Method.invoke(Method.java:498)
>> >> > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
>> >> >
>> >> > Shuyi
>> >> >
>> >> >
>> >> > On Wed, Mar 21, 2018 at 6:09 PM, Shuyi Chen <suez1224@gmail.com> wrote:
>> >> >
>> >> > > Thanks a lot, Anton. This seems to be a bug in Calcite. When the statement
>> >> > > involves record types, SQL validation seems to work, but the rel plan
>> >> > > generated might be wrong. I can also reproduce your case:
>> >> > >
>> >> > > MockTable deptNestedTable =
>> >> > >     MockTable.create(this, salesSchema, "DEPT_NESTED", false, 4);
>> >> > > deptNestedTable.addColumn("DEPTNO", f.intType, true);
>> >> > > deptNestedTable.addColumn("NAME", f.varchar10Type);
>> >> > > deptNestedTable.addColumn("SKILLRECORD", f.skillRecordType);
>> >> > > deptNestedTable.addColumn("EMPLOYEES", f.empListType);
>> >> > > registerTable(deptNestedTable);
>> >> > >
>> >> > > Run the following test:
>> >> > >
>> >> > > @Test public void testArrayOfRecord() {
>> >> > >   sql("select skillrecord, employees from dept_nested").ok();
>> >> > > }
>> >> > >
>> >> > > yields:
>> >> > > LogicalProject(SKILLRECORD=[$0], EMPLOYEES=[$1])
>> >> > >   LogicalProject(SKILLRECORD=[$2], SKILLRECORD1=[$3], EMPLOYEES=[$4])
>> >> > >     LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC], EMPLOYEES=[$3])
>> >> > >       LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
>> >> > >
>> >> > > Sometimes, it works:
>> >> > >
>> >> > > @Test public void testArrayOfRecord() {
>> >> > >   sql("select name, employees from dept_nested").ok();
>> >> > > }
>> >> > >
>> >> > > yields:
>> >> > >
>> >> > > LogicalProject(NAME=[$1], EMPLOYEES=[$4])
>> >> > >   LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC], EMPLOYEES=[$3])
>> >> > >     LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
>> >> > >
>> >> > > I can take a deeper look.
>> >> > >
>> >> > > Shuyi
>> >> > >
>> >> > > On Wed, Mar 21, 2018 at 11:06 AM, Anton Kedin <kedin@google.com.invalid> wrote:
>> >> > >
>> >> > >> Hi,
>> >> > >>
>> >> > >> I have an issue that I am not sure how to handle and would appreciate any
>> >> > >> pointers.
>> >> > >>
>> >> > >> I have a table with row type:
>> >> > >> RecordType(
>> >> > >>     INTEGER orderId,
>> >> > >>     RecordType(VARCHAR name, INTEGER personId)
>> >> > >>         person,
>> >> > >>     RecordType(VARCHAR sku, INTEGER price, VARCHAR currency, VARCHAR ARRAY tags)
>> >> > >>         ARRAY items
>> >> > >> )
>> >> > >>
>> >> > >> With this row type I am trying to model a JSON object which looks like
>> >> > >> this:
>> >> > >> { "orderId" : 1,
>> >> > >>   "person" : { "name" : "john", "personId" : 12 },
>> >> > >>   "items": [
>> >> > >>     { "sku" : "aaa01", "price" : 12, "currency" : "USD", "tags" : ["blue", "book"] }
>> >> > >>   ]}
>> >> > >>
>> >> > >> When selecting the whole items array I get the following plan:
>> >> > >> SELECT items FROM PCOLLECTION
>> >> > >>
>> >> > >> LogicalProject(items=[$3])
>> >> > >>   LogicalProject(orderId=[$0], name=[$1.name], personId=[$1.personId], items=[$2])
>> >> > >>     LogicalTableScan(table=[[PCOLLECTION]])
>> >> > >>
>> >> > >> Which looks correct, and it works. One thing to note here is that Calcite
>> >> > >> flattens the person row and makes the input ref for the items field $3,
>> >> > >> as expected.
>> >> > >>
>> >> > >> But when I want to get a specific element from that array, I get the
>> >> > >> following:
>> >> > >> SELECT items[0] FROM PCOLLECTION
>> >> > >>
>> >> > >> LogicalProject(EXPR$0$0=[ITEM($2, 0).sku], EXPR$0$1=[ITEM($2, 0).price], EXPR$0$2=[ITEM($2, 0).currency], EXPR$0$3=[ITEM($2, 0).tags])
>> >> > >>   LogicalProject(orderId=[$0], name=[$1.name], personId=[$1.personId], items=[$2])
>> >> > >>     LogicalTableScan(table=[[PCOLLECTION]])
>> >> > >>
>> >> > >> The first project looks the same: flattened person row, items array, all
>> >> > >> similar to the above. But the outer project calls ITEM($2, i). I would
>> >> > >> expect it to be ITEM($3, i) instead, to adjust for the flattened person
>> >> > >> row. It keeps the index as $2, which would have been correct if the row
>> >> > >> were not flattened, but it should be $3 for the flattened row, as in the
>> >> > >> previous example.
>> >> > >>
>> >> > >> Is there something I am missing, or is it a bug, and Calcite should adjust
>> >> > >> the input ref index to account for flattened rows in this case as well?
>> >> > >>
>> >> > >> Thank you,
>> >> > >> Anton
>> >> > >>
>> >> > >
>> >> > >
>> >> > >
>> >> > > --
>> >> > > "So you have to trust that the dots will somehow connect in your future."
>> >> > >
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > "So you have to trust that the dots will somehow connect in your future."
>> >> >
>> >>
>> >
>> >
>> >
>> > --
>> > "So you have to trust that the dots will somehow connect in your future."
>> >
>>
>>
>>
>> --
>> "So you have to trust that the dots will somehow connect in your future."
>>
>

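The ordinal shift discussed in this thread can be illustrated with a small standalone sketch. It is not Calcite code and it is not the fix from the PR: FlattenOrdinals, Field, and flattenedOrdinal are hypothetical names invented for illustration. The model assumes that only record-typed fields expand when a row is flattened, while arrays (even arrays of records) stay a single column, which matches the plans quoted above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Calcite: models how flattening shifts field ordinals. */
final class FlattenOrdinals {

  /** A field is a leaf (no children) or a record with nested fields. Arrays are modeled as leaves. */
  static final class Field {
    final String name;
    final List<Field> children;

    Field(String name, Field... children) {
      this.name = name;
      this.children = Arrays.asList(children);
    }

    /** Number of columns this field occupies once all nested records are flattened. */
    int flattenedWidth() {
      if (children.isEmpty()) {
        return 1;
      }
      int width = 0;
      for (Field child : children) {
        width += child.flattenedWidth();
      }
      return width;
    }
  }

  /** Maps a pre-flattening field ordinal to the ordinal of its first column after flattening. */
  static int flattenedOrdinal(List<Field> row, int ordinal) {
    int offset = 0;
    for (int i = 0; i < ordinal; i++) {
      offset += row.get(i).flattenedWidth();
    }
    return offset;
  }

  public static void main(String[] args) {
    // DEPT_NESTED from the test above: SKILLRECORD is a record, EMPLOYEES an array.
    List<Field> deptNested = Arrays.asList(
        new Field("DEPTNO"),
        new Field("NAME"),
        new Field("SKILLRECORD", new Field("TYPE"), new Field("DESC")),
        new Field("EMPLOYEES"));
    System.out.println("EMPLOYEES: $3 -> $" + flattenedOrdinal(deptNested, 3));

    // The order table from the original report: person is a record, items an array.
    List<Field> order = Arrays.asList(
        new Field("orderId"),
        new Field("person", new Field("name"), new Field("personId")),
        new Field("items"));
    System.out.println("items: $2 -> $" + flattenedOrdinal(order, 2));
  }
}
```

For DEPT_NESTED, EMPLOYEES moves from $3 to $4 because SKILLRECORD expands into TYPE and DESC; for the order table, items moves from $2 to $3. Those are the adjusted ordinals that ITEM(...) should have referenced in the flattened plans.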