phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-995) ADD ENCODE AND LPAD functions
Date Tue, 03 Jun 2014 00:52:01 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016098#comment-14016098 ]

James Taylor commented on PHOENIX-995:
--------------------------------------

These are looking very good, [~tdsilva]. Thanks so much for the contributions.

bq.  was not able to understand how the preservesOrder() is supposed to be implemented. Does
OrderPreserving.YES mean that if inputs to the function are ordered in a particular way, applying
the function will not re-order the outputs with respect to the inputs? However, can they be sorted differently,
e.g. INVERT has OrderPreserving.YES, even though it inverts the bits of the input?

Yes, exactly what you said: if inputs to the function are ordered in a particular way, applying
the function will not re-order the outputs with respect to the inputs. This holds regardless of the SortOrder,
which is why INVERT can still return OrderPreserving.YES. It's basically a determination
of whether or not the rows need to be re-sorted. The ASC/DESC option in ORDER BY takes
into account whether or not the SortOrder matches.
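
To make the INVERT case concrete, here is a minimal Java sketch. It is not Phoenix code: the class and method names are hypothetical, and it assumes fixed-width keys compared as unsigned bytes. Inverting the bits flips the direction of every comparison consistently, so rows that arrive sorted stay sorted (in the opposite direction) and no re-sort is needed.

```java
class InvertOrderSketch {
    // Bitwise inversion of every byte, as a stand-in for what INVERT does to a key.
    static byte[] invert(byte[] in) {
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = (byte) ~in[i];
        }
        return out;
    }

    // Lexicographic comparison treating bytes as unsigned values.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }
}
```

For equal-length keys a and b, compareUnsigned(a, b) and compareUnsigned(invert(a), invert(b)) always have opposite signs, which is the sense in which the function does not scramble the input order.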

I believe LPAD can be OrderPreserving.YES as long as the amount of padding being applied is
constant (isStateless & isDeterministic are both true). I'm not sure about your ENCODE
function. A BIGINT sorts naturally with its value. Does a base62 encoded BIGINT sort the same
way?
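
One way to explore the base62 question is a small standalone check like the sketch below. The alphabet here is an assumption (digits, then upper case, then lower case, i.e. ascending ASCII order); the alphabet in the patch may differ, and the class and helper names are hypothetical. Under this assumption, unpadded encodings do not sort like their numeric values, while encodings left-padded to a fixed width do.

```java
class Base62OrderCheck {
    // Assumed alphabet in ascending ASCII order; the real ENCODE alphabet may differ.
    static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Encodes a non-negative long as a base62 string.
    static String encode(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("non-negative values only");
        }
        if (n == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (n > 0) {
            sb.append(ALPHABET.charAt((int) (n % 62)));
            n /= 62;
        }
        return sb.reverse().toString();
    }

    // Left-pads s with fill up to the given length (no truncation).
    static String lpad(String s, int length, char fill) {
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < length; i++) {
            sb.append(fill);
        }
        return sb.append(s).toString();
    }

    // True if the encoded strings of strictly ascending inputs are also
    // strictly ascending under String.compareTo.
    static boolean preservesOrder(long[] ascending, boolean pad) {
        String prev = null;
        for (long v : ascending) {
            String cur = pad ? lpad(encode(v), 11, '0') : encode(v);
            if (prev != null && prev.compareTo(cur) >= 0) {
                return false;
            }
            prev = cur;
        }
        return true;
    }
}
```

With the inputs 61 and 62, the unpadded encodings are "z" and "10", which sort in the wrong order as strings; padding both to a fixed width restores the numeric order. An alphabet that is not in ascending byte order would break the ordering even with padding.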

The getKeyFormationTraversalIndex() is a way for a built-in function to define how it interacts
with the formation of the start key/stop key when a row key column is used as an argument.
An example would be an expression like: s LIKE 'a%'. In this case, assuming s is the leading
PK column, we'd know that the start key would be 'a' and the stop key would be 'b' (exclusive).
The getKeyFormationTraversalIndex() allows these kinds of optimizations to be expressed (as
opposed to falling back to a full table scan).
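
The start key/stop key derivation for a prefix match can be sketched as follows. This is an illustration only, not how Phoenix implements it: the class and method names are hypothetical. The start key is the prefix itself, and the exclusive stop key is the prefix with its last byte that is not 0xFF incremented and everything after it dropped.

```java
import java.util.Arrays;

class PrefixKeyRange {
    // Returns the smallest key that is strictly greater than every key
    // starting with the given prefix, or null if no such key exists
    // (empty prefix or all bytes 0xFF), meaning the scan has no upper bound.
    static byte[] exclusiveStopKey(byte[] prefix) {
        byte[] stop = Arrays.copyOf(prefix, prefix.length);
        for (int i = stop.length - 1; i >= 0; i--) {
            if (stop[i] != (byte) 0xFF) {
                stop[i]++;                       // unsigned increment of this byte
                return Arrays.copyOf(stop, i + 1); // drop trailing 0xFF bytes
            }
        }
        return null;
    }
}
```

For the prefix 'a' this yields 'b', matching the range described above: scan from 'a' (inclusive) to 'b' (exclusive).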

If you don't envision LPAD or ENCODE(num,'base62') being used in a WHERE clause, it's kind
of moot, in which case you can just return NO_TRAVERSAL. If, on the other hand, you think there'll
be expressions like WHERE ENCODE(num,'base62') = 'abcdefg', then it might make sense to implement
it. Given that ENCODE will be used more as a key generator, this seems unlikely, so I'd advise
starting with NO_TRAVERSAL.

> ADD ENCODE AND LPAD functions 
> ------------------------------
>
>                 Key: PHOENIX-995
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-995
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Thomas D'Silva
>         Attachments: PHOENIX-995.patch
>
>
> Add ENCODE(input number, format encodeformat) which can be used to convert a base 10 number to a base 62 number
> Add LPAD(input string, length int [, fill string]) which can be used to left pad an input string.
> Together these two functions can be used to generate IDs using sequences, for example:
> {code:sql}
> CREATE SEQUENCE foo.bar START WITH 0 INCREMENT BY 62
> SELECT LPAD(ENCODE(NEXT VALUE FOR foo.bar,'BASE62'), 10,'0') FROM SYSTEM."SEQUENCE"
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
