[ https://issues.apache.org/jira/browse/DRILL-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661699#comment-13661699 ]
Julian Hyde commented on DRILL-61:
----------------------------------
Let me know when this bug is fixed. I have checked in functionality to implement GROUP BY
etc. (see https://github.com/julianhyde/incubator-drill/commit/febcb702c224f8aa148377272eb88929cfa15ee7 )
but the tests are disabled pending this bug.
> Logical plan operator "collapsesegment" produces wrong results
> --------------------------------------------------------------
>
> Key: DRILL-61
> URL: https://issues.apache.org/jira/browse/DRILL-61
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Julian Hyde
>
> Logical plan operator "collapsingaggregate" produces wrong results. The input contains a
> null deptId value, which may be responsible.
> Query:
> {
>   "head" : {
>     "type" : "apache_drill_logical_plan",
>     "version" : 1,
>     "generator" : {
>       "type" : "manual",
>       "info" : "na"
>     }
>   },
>   "storage" : [ {
>     "type" : "queue",
>     "name" : "queue"
>   }, {
>     "type" : "classpath",
>     "name" : "donuts-json"
>   } ],
>   "query" : [ {
>     "op" : "scan",
>     "@id" : 1,
>     "memo" : "initial_scan",
>     "storageengine" : "donuts-json",
>     "selection" : {
>       "path" : "/employees.json",
>       "type" : "JSON"
>     },
>     "ref" : "_MAP"
>   }, {
>     "op" : "project",
>     "input" : 1,
>     "@id" : 2,
>     "projections" : [ {
>       "ref" : "output.deptId",
>       "expr" : "_MAP.deptId"
>     } ]
>   }, {
>     op: "segment",
>     "input" : 2,
>     "@id" : 3,
>     ref: "segment",
>     exprs: ["deptId"]
>   }, {
>     "input" : 3,
>     "@id" : 4,
>     op: "collapsingaggregate",
>     within: "segment",
>     carryovers: [ "deptId" ],
>     aggregations: [
>       { ref: "typeCount", expr: "count(1)" }
>     ]
>   }, {
>     "op" : "store",
>     "input" : 4,
>     "@id" : 5,
>     "memo" : "output sink",
>     "target" : {
>       "number" : 0
>     },
>     "partition" : null,
>     "storageEngine" : "queue"
>   } ]
> }
> gives result
> { "typeCount" : 2, "deptId" : 34 }
> { "typeCount" : 2, "deptId" : null }
> { "typeCount" : 1, "deptId" : 31 }
> { "typeCount" : 1, "deptId" : 31 }
> I think the correct result would be
> { "typeCount" : 2, "deptId" : 33 }
> { "typeCount" : 2, "deptId" : 34 }
> { "typeCount" : 1, "deptId" : null }
> { "typeCount" : 1, "deptId" : 31 }
> Note that the "segment" operator is working correctly. A similar query with
> "collapsingaggregate" removed:
> {
>   "head" : {
>     "type" : "apache_drill_logical_plan",
>     "version" : 1,
>     "generator" : {
>       "type" : "manual",
>       "info" : "na"
>     }
>   },
>   "storage" : [ {
>     "type" : "queue",
>     "name" : "queue"
>   }, {
>     "type" : "classpath",
>     "name" : "donuts-json"
>   } ],
>   "query" : [ {
>     "op" : "scan",
>     "@id" : 1,
>     "memo" : "initial_scan",
>     "storageengine" : "donuts-json",
>     "selection" : {
>       "path" : "/employees.json",
>       "type" : "JSON"
>     },
>     "ref" : "_MAP"
>   }, {
>     "op" : "project",
>     "input" : 1,
>     "@id" : 2,
>     "projections" : [ {
>       "ref" : "output.deptId",
>       "expr" : "_MAP.deptId"
>     } ]
>   }, {
>     op: "segment",
>     "input" : 2,
>     "@id" : 3,
>     ref: "segment",
>     exprs: ["deptId"]
>   }, {
>     "op" : "store",
>     "input" : 3,
>     "@id" : 5,
>     "memo" : "output sink",
>     "target" : {
>       "number" : 0
>     },
>     "partition" : null,
>     "storageEngine" : "queue"
>   } ]
> }
> gives
> { "segment" : 1, "deptId" : 33 }
> { "segment" : 1, "deptId" : 33 }
> { "segment" : 2, "deptId" : 34 }
> { "segment" : 2, "deptId" : 34 }
> { "segment" : 3, "deptId" : null }
> { "segment" : 4, "deptId" : 31 }
> It is reasonable to assume that these are the records flowing into the
> "collapsingaggregate" ROP in the first query.
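
The expected result can be cross-checked outside Drill from the "segment" output listed above. The following is a minimal Python sketch, not Drill's implementation: the function name collapse_count and its parameters are hypothetical, and it assumes the records arrive already ordered by segment, with one output record per segment carrying over the first deptId seen.

```python
from itertools import groupby

# Records as emitted by the "segment" operator in the second query.
segment_output = [
    {"segment": 1, "deptId": 33},
    {"segment": 1, "deptId": 33},
    {"segment": 2, "deptId": 34},
    {"segment": 2, "deptId": 34},
    {"segment": 3, "deptId": None},
    {"segment": 4, "deptId": 31},
]


def collapse_count(records, within="segment", carryover="deptId"):
    """Hypothetical model of count(1) within a segment.

    Emits one record per run of equal `within` values, carrying over
    the `carryover` field from the first record of each run.
    """
    results = []
    for _, group in groupby(records, key=lambda r: r[within]):
        rows = list(group)
        results.append({"typeCount": len(rows), carryover: rows[0][carryover]})
    return results


expected = collapse_count(segment_output)
for row in expected:
    print(row)
```

Under these assumptions the sketch yields one record per segment (counts 2, 2, 1, 1 for deptId 33, 34, null, 31), which matches the result the report argues is correct. The null deptId forms its own segment and is counted once.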
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira