phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-852) Optimize child/parent foreign key joins
Date Thu, 28 Aug 2014 05:10:57 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113346#comment-14113346 ]

James Taylor commented on PHOENIX-852:
--------------------------------------

For ScanRanges, I'd create a new method like the following:
{code}
public int getPkColumnSpan() {
    return ScanUtil.calculateSlotSpan(ranges, slotSpan);
}
{code}
This will return the number of PK columns that have been optimized by the WhereOptimizer.

For the constants, any valid value for the given PDataType will do. If you want,
feel free to add a PDataType.getSampleValue() method that returns one. You'd need to create
a where clause with an expression like (joinKeyCol1, joinKeyCol2, ...) IN ((sampleVal1, sampleVal2,
...), (sampleVal1, sampleVal2, ...)). I think you'll need two values in the IN clause;
otherwise it'll be compiled to an = expression instead.
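
As a rough illustration of the shape of that where clause, here is a minimal, self-contained sketch that only assembles the expression text. The class SampleInClause and its build method are hypothetical names made up for this example and are not part of Phoenix; the sample literals are supplied by the caller as stand-ins for whatever the proposed PDataType.getSampleValue() would return. The sketch repeats the same tuple twice, as in the expression above; whether the two tuples need to differ to avoid the = rewrite is not settled here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not Phoenix code: builds the text of a
// row-value-constructor IN expression over the join key columns.
class SampleInClause {

    // joinKeyCols: names of the PK columns taking part in the join.
    // sampleLiterals: one SQL literal per column (assumed to be valid for that column's type).
    static String build(List<String> joinKeyCols, List<String> sampleLiterals) {
        if (joinKeyCols.isEmpty() || joinKeyCols.size() != sampleLiterals.size()) {
            throw new IllegalArgumentException("need one sample literal per join key column");
        }
        String lhs = "(" + String.join(", ", joinKeyCols) + ")";
        String tuple = "(" + String.join(", ", sampleLiterals) + ")";
        // Two entries in the IN list, so that the expression is not a single-value IN.
        return lhs + " IN (" + tuple + ", " + tuple + ")";
    }

    public static void main(String[] args) {
        System.out.println(build(Arrays.asList("K1", "K2"), Arrays.asList("1", "'a'")));
    }
}
```

The resulting string would still have to be parsed and compiled as a where clause so that the WhereOptimizer can produce the ScanRanges to inspect.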


> Optimize child/parent foreign key joins
> ---------------------------------------
>
>                 Key: PHOENIX-852
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-852
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Maryann Xue
>         Attachments: 852-2.patch, 852.patch, PHOENIX-852.patch
>
>
> Often times a join will occur from a child to a parent. Our current algorithm would do
> a full scan of one side or the other. We can do much better than that if the HashCache contains
> the PK (or even part of the PK) from the table being joined to. In these cases, we should
> drive the second scan through a skip scan on the server side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
