phoenix-dev mailing list archives

From "James Taylor (JIRA)" <>
Subject [jira] [Updated] (PHOENIX-2067) Sort order incorrect for variable length DESC columns
Date Fri, 03 Jul 2015 21:52:04 GMT


James Taylor updated PHOENIX-2067:
    Attachment: PHOENIX-2067-wip3.patch

More WIP. This patch is down to just a few test failures, but it does not yet include an upgrade
path or a conditional optimization for existing data. It sorts nulls last when DESC, but there's
a problem with that: we'd need to include trailing nulls up to the last DESC row key column, and
you wouldn't be able to add a new DESC row key column without mucking with the data (which is a
showstopper).

Instead, I'm going to use a null-byte separator for null values in DESC columns and a 0xFF
separator otherwise. That way, nulls will sort first for both ASC and DESC, and DESC sort order
will work for all values.
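
The separator choice above can be sketched as follows. This is a simplified model, not the
Phoenix implementation: `encode_desc` and the constants are hypothetical names, DESC values are
assumed to be stored with every byte inverted, and non-null values are assumed to be non-empty
and free of 0x00 bytes, so an inverted value never contains 0xFF.

```python
from typing import Optional

NULL_SEPARATOR = 0x00  # terminator written for a null DESC value
DESC_SEPARATOR = 0xFF  # terminator written after a non-null DESC value


def invert(value: bytes) -> bytes:
    """Flip every byte so ascending byte order yields descending value order."""
    return bytes(0xFF - b for b in value)


def encode_desc(value: Optional[bytes], separator: int = DESC_SEPARATOR) -> bytes:
    """Encode one variable-length DESC key column (hypothetical helper).

    Assumes a non-null value is non-empty and contains no 0x00 byte.
    """
    if value is None:
        # A null is just the null separator, so it sorts before every value.
        return bytes([NULL_SEPARATOR])
    return invert(value) + bytes([separator])


rows = [b"ab", None, b"abc", b"b"]

# 0xFF terminator: b"abc" sorts before its prefix b"ab", as DESC requires.
proposed = sorted(rows, key=encode_desc)

# 0x00 terminator for everything: the shorter prefix b"ab" wrongly comes first.
zero_terminated = sorted(rows, key=lambda v: encode_desc(v, NULL_SEPARATOR))

print(proposed)
print(zero_terminated)
```

The only case the terminator decides is one value being a prefix of another: the comparison
reaches the terminator of the shorter key, and 0xFF is larger than any inverted value byte while
0x00 is not.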

> Sort order incorrect for variable length DESC columns
> -----------------------------------------------------
>                 Key: PHOENIX-2067
>                 URL:
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.4.0
>         Environment: HBase 0.98.6-cdh5.3.0
> jdk1.7.0_67 x64
> CentOS release 6.4 (2.6.32-358.el6.x86_64)
>            Reporter: Mykola Komarnytskyy
>            Assignee: James Taylor
>         Attachments: PHOENIX-2067-wip.patch, PHOENIX-2067-wip2.patch, PHOENIX-2067-wip3.patch
> Steps to reproduce:
> 1. Create a table: 
> CREATE TABLE mytable (id BIGINT not null PRIMARY KEY, timestamp BIGINT, log_message varchar);
> 2. Create two indexes:
> CREATE INDEX mytable_index_search ON mytable(timestamp,id) INCLUDE (log_message) SALT_BUCKETS=16;
> CREATE INDEX mytable_index_search_desc ON mytable(timestamp DESC,id DESC) INCLUDE (log_message);
> 3. Upsert values:
> UPSERT INTO mytable VALUES(1, 1434983826018, 'message1');
> UPSERT INTO mytable VALUES(2, 1434983826100, 'message2');
> UPSERT INTO mytable VALUES(3, 1434983826101, 'message3');
> UPSERT INTO mytable VALUES(4, 1434983826202, 'message4');
> 4. Sort DESC by timestamp:
> select timestamp,id,log_message from mytable ORDER BY timestamp DESC;
> Failure: the data is sorted incorrectly. When two longs differ only in their last two digits
> (e.g. 1434983826155 and 1434983826100) and one of them ends with '00', they are returned in
> the wrong order.
> Sorting result:
> 1434983826202
> 1434983826100
> 1434983826101
> 1434983826018
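
The reported order can be reproduced with a toy encoding. This is a sketch under stated
assumptions, not the actual Phoenix format: it assumes the indexed value is stored in a
variable-length form of base-100 digit pairs with trailing '00' pairs dropped, that DESC inverts
every byte, and that a separator byte terminates the column. `to_variable_length` and `desc_key`
are hypothetical helpers.

```python
def to_variable_length(n: int) -> bytes:
    """Toy variable-length encoding of a positive integer (assumption).

    One leading byte holds the number of base-100 digit pairs; each pair is
    stored as pair + 1 so no digit byte is 0x00; trailing zero pairs are dropped.
    """
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits
    pairs = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]
    exponent = len(pairs)
    while pairs and pairs[-1] == 0:
        pairs.pop()  # 1434983826100 loses its final '00' pair here
    return bytes([exponent] + [p + 1 for p in pairs])


def desc_key(n: int, separator: int) -> bytes:
    """Invert every byte for DESC, then append the separator byte."""
    return bytes(0xFF - b for b in to_variable_length(n)) + bytes([separator])


timestamps = [1434983826018, 1434983826100, 1434983826101, 1434983826202]

# 0x00 separator: ...100 becomes a prefix of ...101 and sorts ahead of it.
with_zero = sorted(timestamps, key=lambda n: desc_key(n, 0x00))

# 0xFF separator: the shorter key sorts after the longer one.
with_ff = sorted(timestamps, key=lambda n: desc_key(n, 0xFF))

print(with_zero)
print(with_ff)
```

Under these assumptions the 0x00 separator gives the same order as the sorting result in the
report, and the 0xFF separator gives the expected descending order.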

This message was sent by Atlassian JIRA
