phoenix-dev mailing list archives

From "James Taylor (JIRA)" <>
Subject [jira] [Commented] (PHOENIX-852) Optimize child/parent foreign key joins
Date Tue, 26 Aug 2014 18:03:59 GMT


James Taylor commented on PHOENIX-852:

My top-level point is that we need a common API in WhereOptimizer; otherwise it'll be
difficult to maintain.

Just refactor WhereOptimizer.pushKeyExpressionsToScan() to return the info you need. The main
thing it produces is a ScanRanges which will give you everything you need. If ScanRanges.useSkipScan()
is false, then you won't/can't do the optimization.
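
As a rough illustration of that check, a caller could gate the join optimization along the
lines below. This is only a sketch: the ScanRanges class here is a tiny stand-in rather than
Phoenix's real class, only the useSkipScan() name comes from the comment above, and
canOptimizeJoin is a hypothetical helper name.

```java
public class SkipScanGate {
    /** Minimal stand-in for Phoenix's ScanRanges; only useSkipScan() is modeled. */
    static final class ScanRanges {
        private final boolean useSkipScan;

        ScanRanges(boolean useSkipScan) {
            this.useSkipScan = useSkipScan;
        }

        boolean useSkipScan() {
            return useSkipScan;
        }
    }

    /** Hypothetical helper: the optimization applies only when a skip scan is usable. */
    static boolean canOptimizeJoin(ScanRanges ranges) {
        return ranges != null && ranges.useSkipScan();
    }

    public static void main(String[] args) {
        System.out.println(canOptimizeJoin(new ScanRanges(true)));
        System.out.println(canOptimizeJoin(new ScanRanges(false)));
    }
}
```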

My first thought is to have something like this:

    public static ScanRanges pushKeyExpressionsToScan(StatementContext context, FilterableStatement statement,
            Expression whereClause, Set<Expression> extractNodes) {

This function wouldn't call context.setScanRanges(); it would just return the ScanRanges instead.
And it wouldn't do this bit at the end:
        if (whereClause == null) {
            return null;
        } else {
            return whereClause.accept(new RemoveExtractedNodesVisitor(extractNodes));
        }

Instead, you'd keep a separate function with the original signature that would do this:
        if (whereClause == null) {
            return null;
        } else {
            return whereClause.accept(new RemoveExtractedNodesVisitor(extractNodes));
        }
Also change ScanRanges to take the minMaxRange in its constructor, so that context.setScanRanges()
can get it from the ScanRanges instead.
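
To make the proposed split concrete, here is a minimal sketch of the two functions. Everything
in it is a toy stand-in, not Phoenix's actual code: Expression, ScanRanges and StatementContext
are simplified placeholder classes, the FilterableStatement parameter is left out for brevity,
the onRowKey flag and the String-typed minMaxRange are assumptions made purely for illustration,
and the one-line filter check stands in for RemoveExtractedNodesVisitor.

```java
import java.util.HashSet;
import java.util.Set;

public class WhereOptimizerSketch {
    /** Toy stand-in for an expression: a label plus a flag for "constrains the row key". */
    static final class Expression {
        final String text;
        final boolean onRowKey;

        Expression(String text, boolean onRowKey) {
            this.text = text;
            this.onRowKey = onRowKey;
        }
    }

    /** Toy stand-in for ScanRanges; minMaxRange is supplied through the constructor. */
    static final class ScanRanges {
        private final boolean useSkipScan;
        private final String minMaxRange; // placeholder type, for illustration only

        ScanRanges(boolean useSkipScan, String minMaxRange) {
            this.useSkipScan = useSkipScan;
            this.minMaxRange = minMaxRange;
        }

        boolean useSkipScan() {
            return useSkipScan;
        }

        String getMinMaxRange() {
            return minMaxRange;
        }
    }

    /** Toy stand-in for StatementContext; it only holds the scan ranges. */
    static final class StatementContext {
        private ScanRanges scanRanges;

        void setScanRanges(ScanRanges scanRanges) {
            this.scanRanges = scanRanges;
        }

        ScanRanges getScanRanges() {
            return scanRanges;
        }
    }

    /** New variant: returns the ScanRanges and fills extractNodes; never touches the context. */
    static ScanRanges pushKeyExpressionsToScan(StatementContext context, Expression whereClause,
            Set<Expression> extractNodes) {
        boolean usable = whereClause != null && whereClause.onRowKey;
        if (usable) {
            extractNodes.add(whereClause);
        }
        return new ScanRanges(usable, "unbounded");
    }

    /** Original-style entry point: sets the ranges on the context, returns the leftover filter. */
    static Expression pushKeyExpressionsToScan(StatementContext context, Expression whereClause) {
        Set<Expression> extractNodes = new HashSet<>();
        context.setScanRanges(pushKeyExpressionsToScan(context, whereClause, extractNodes));
        if (whereClause == null) {
            return null;
        }
        // Toy replacement for whereClause.accept(new RemoveExtractedNodesVisitor(extractNodes)).
        return extractNodes.contains(whereClause) ? null : whereClause;
    }

    public static void main(String[] args) {
        StatementContext context = new StatementContext();
        Expression leftover = pushKeyExpressionsToScan(context, new Expression("pk = 1", true));
        System.out.println(leftover == null);
        System.out.println(context.getScanRanges().useSkipScan());
    }
}
```

With this shape, a join compiler could call the three-argument variant to inspect the
ScanRanges without mutating the context, while existing callers keep using the
two-argument one.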

> Optimize child/parent foreign key joins
> ---------------------------------------
>                 Key: PHOENIX-852
>                 URL:
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Maryann Xue
>         Attachments: 852-2.patch, 852.patch, PHOENIX-852.patch
> Oftentimes a join will occur from a child to a parent. Our current algorithm would do
> a full scan of one side or the other. We can do much better than that if the HashCache contains
> the PK (or even part of the PK) from the table being joined to. In these cases, we should
> drive the second scan through a skip scan on the server side.

This message was sent by Atlassian JIRA
