drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (DRILL-64) In reference implementation, full outer join returns too many rows
Date Thu, 23 May 2013 16:04:19 GMT

     [ https://issues.apache.org/jira/browse/DRILL-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jacques Nadeau resolved DRILL-64.
---------------------------------

    Resolution: Fixed
      Assignee: Timothy Chen

Fixed in changeset d162e13383f4001ab5e6e4b9d27288e0dfaee22d
                
> In reference implementation, full outer join returns too many rows
> ------------------------------------------------------------------
>
>                 Key: DRILL-64
>                 URL: https://issues.apache.org/jira/browse/DRILL-64
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Timothy Chen
>            Priority: Minor
>
> Full outer join returns too many rows. The query
> {
>   "head" : {
>     "type" : "apache_drill_logical_plan",
>     "version" : 1,
>     "generator" : {
>       "type" : "manual",
>       "info" : "na"
>     }
>   },
>   "storage" : [ {
>     "type" : "queue",
>     "name" : "queue"
>   }, {
>     "type" : "classpath",
>     "name" : "donuts-json"
>   } ],
>   "query" : [ {
>     "op" : "scan",
>     "@id" : 1,
>     "memo" : "initial_scan",
>     "storageengine" : "donuts-json",
>     "selection" : {
>       "path" : "/employees.json",
>       "type" : "JSON"
>     },
>     "ref" : "_MAP"
>   }, {
>     "op" : "project",
>     "input" : 1,
>     "@id" : 2,
>     "projections" : [ {
>       "ref" : "output.deptId",
>       "expr" : "_MAP.deptId"
>     } ]
>   }, {
>     "op" : "project",
>     "input" : 2,
>     "@id" : 10,
>     "projections" : [ {
>       "ref" : "output.deptId1",
>       "expr" : "deptId"
>     } ]
>   }, {
>     "op" : "scan",
>     "@id" : 3,
>     "memo" : "initial_scan",
>     "storageengine" : "donuts-json",
>     "selection" : {
>       "path" : "/departments.json",
>       "type" : "JSON"
>     },
>     "ref" : "_MAP"
>   }, {
>     "op" : "project",
>     "input" : 3,
>     "@id" : 4,
>     "projections" : [ {
>       "ref" : "output.deptId",
>       "expr" : "_MAP.deptId"
>     } ]
>   },  {
>     "op": "join",
>     "left": 10,
>     "right": 4,
>     "@id" : 5,
>     "type": "outer",
>     "conditions": [
>       {"relationship": "==", "left": "deptId1", "right": "deptId"}
>     ]
>   }, {
>     "op" : "store",
>     "input" : 5,
>     "@id" : 6,
>     "memo" : "output sink",
>     "target" : {
>       "number" : 0
>     },
>     "partition" : null,
>     "storageEngine" : "queue"
>   } ]
> }
> returns 
> {  "deptId" : 31,  "deptId1" : 31} 
> {  "deptId" : 33,  "deptId1" : 33} 
> {  "deptId" : 33,  "deptId1" : 33} 
> {  "deptId" : 34,  "deptId1" : 34} 
> {  "deptId" : 34,  "deptId1" : 34} 
> {  "deptId1" : null} 
> {  "deptId" : 31} 
> {  "deptId" : 33} 
> {  "deptId" : 34} 
> {  "deptId" : 35} 
> but I think it should return
> {  "deptId" : 31,  "deptId1" : 31} 
> {  "deptId" : 33,  "deptId1" : 33} 
> {  "deptId" : 33,  "deptId1" : 33} 
> {  "deptId" : 34,  "deptId1" : 34} 
> {  "deptId" : 34,  "deptId1" : 34} 
> {  "deptId1" : null} 
> {  "deptId" : 35} 
> because only 35 on the left is unmatched on the right; {31, 33, 34} have at least one
match.
> Also, it's cosmetic, but the "outer" join type should be called "full" because "left"
is also a kind of outer join, at least in SQL parlance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message