phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Artur Tamazian (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PHOENIX-5200) LIMIT+OFFSET works incorrectly when querying by index fields only and using DISTINCT
Date Fri, 15 Mar 2019 19:37:00 GMT
Artur Tamazian created PHOENIX-5200:
---------------------------------------

             Summary: LIMIT+OFFSET works incorrectly when querying by index fields only and
using DISTINCT
                 Key: PHOENIX-5200
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5200
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.13.2-cdh
            Reporter: Artur Tamazian


Example:

{code}

drop table test

create table test (
pk integer not null primary key,
index_column1 integer,
index_column2 integer,
col3 integer
);

create index if not exists test_index on WATSON_TEXT_MESSAGE.test (index_column1, index_column2)
include (col3);

upsert into test (pk, index_column1, index_column2, col3) values (1, 1, 1, 1);
upsert into test (pk, index_column1, index_column2, col3) values (2, 2, 2, 2);
upsert into test (pk, index_column1, index_column2, col3) values (3, 3, 3, 3);
upsert into test (pk, index_column1, index_column2, col3) values (4, 4, 4, 4);

-- this case works correctly

-- index is used but filtering by non indexed column too
-- returns two values: 2 and 3
select distinct index_column1 from test where index_column1 > 0 and index_column2 >
0 and col3 > 0 limit 2 offset 1



-- this query should return the same, but it doesn't. The only difference is: non indexed
column is not used in where clause

-- returns one value: 2

select distinct index_column1 from test where index_column1 > 0 and index_column2 >
0 limit 2 offset 1

{code}



If index is removed both queries return correct results.

Execution plans show what the problem is. Correct query plan:



{code}

CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TEST
SERVER FILTER BY (INDEX_COLUMN1 > 0 AND INDEX_COLUMN2 > 0 AND COL3 > 0)
SERVER AGGREGATE INTO DISTINCT ROWS BY [INDEX_COLUMN1] LIMIT 3 GROUPS
CLIENT MERGE SORT
CLIENT OFFSET 1
CLIENT 2 ROW LIMIT

{code}



Incorrect query plan:

{code}

CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER    TEST_INDEX [0] - [*]
SERVER FILTER BY FIRST KEY ONLY AND TO_INTEGER("INDEX_COLUMN2") > 0
SERVER DISTINCT PREFIX FILTER OVER ["INDEX_COLUMN1"]
SERVER 2 ROW LIMIT
SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY ["INDEX_COLUMN1"] LIMIT 3 GROUPS
CLIENT OFFSET 1
CLIENT 2 ROW LIMIT

{code}





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message