phoenix-dev mailing list archives

From "Carter Shanklin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-1145) Ensure CHAR type meets SQL standard
Date Tue, 05 Aug 2014 20:00:13 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086664#comment-14086664 ]

Carter Shanklin commented on PHOENIX-1145:
------------------------------------------

Hive recently introduced CHAR and adopted semantics where values are treated as if they were
padded; this is by far the most common approach. VARCHAR semantics are a lot more divided.

Some important nuances were:
When CHAR is used as a grouping key:
{code}
 'A ' and 'A'
{code}
should be considered the same key for grouping when they are CHARs.
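
To make the grouping case concrete, here is a minimal sketch in Python, not Phoenix or Hive code. It assumes that comparing CHARs "as if padded" is equivalent to ignoring trailing spaces when building the grouping key; the names char_key and group_counts are hypothetical.

```python
from collections import defaultdict


def char_key(value: str) -> str:
    """Normalize a CHAR value so trailing spaces do not affect equality.

    Assumption: padded-CHAR comparison behaves like stripping trailing spaces.
    """
    return value.rstrip(" ")


def group_counts(values):
    """Count rows per grouping key, treating 'A ' and 'A' as the same key."""
    counts = defaultdict(int)
    for v in values:
        counts[char_key(v)] += 1
    return dict(counts)


# 'A ' and 'A' collapse into one group, as do 'B' and 'B  '.
result = group_counts(["A ", "A", "B", "B  "])
```

Only trailing spaces are ignored in this sketch; a leading space would still produce a distinct key.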

As for IN lists, a statement like
{code}
'A ' IN ('A', 'B', 'C')
{code}
should evaluate as true. The same should hold if the trailing space is on an element of the IN list rather than on the value being tested.
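
The IN-list expectation can be sketched the same way. This is an illustrative Python model under the assumption that both sides are compared with trailing spaces ignored; char_in is a hypothetical helper, not an API of Phoenix or Hive.

```python
def char_in(value: str, candidates) -> bool:
    """Evaluate `value IN (candidates)` under padded-CHAR semantics.

    Assumption: trailing spaces are insignificant on both the tested
    value and every element of the IN list.
    """
    normalized = {c.rstrip(" ") for c in candidates}
    return value.rstrip(" ") in normalized


# Trailing space on the tested value.
left_padded = char_in("A ", ["A", "B", "C"])
# Trailing space on an IN-list element instead.
list_padded = char_in("A", ["A ", "B", "C"])
```

Both calls are expected to return True under these semantics, while a value that is absent from the list still returns False.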

If these things work, you've covered most of the gotchas you're likely to run into.

> Ensure CHAR type meets SQL standard
> -----------------------------------
>
>                 Key: PHOENIX-1145
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1145
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>
> Phoenix pads CHAR type values with space characters. This can lead to odd results if a character smaller than a space is used. In addition, our comparison logic ignores trailing spaces, but there may be issues with this approach. See this Postgres thread: http://postgresql.1045698.n5.nabble.com/String-comparison-and-the-SQL-standard-td5740721.html
> In addition, CHAR only supports single byte characters as we assume that each CHAR is one byte to calculate offsets.
> We should investigate making CHAR more conformant.
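
The "character smaller than a space" concern in the description above can be illustrated with a small Python sketch. It assumes a simplified model in which the shorter value is padded with spaces before an ordinary code-point comparison; pad_compare is a hypothetical name, and this is not Phoenix's actual comparison code.

```python
def pad_compare(a: str, b: str) -> int:
    """Compare two CHAR values after space-padding the shorter one.

    Returns -1, 0, or 1. Assumption: comparison is by code point
    after padding both values to the same width.
    """
    width = max(len(a), len(b))
    pa, pb = a.ljust(width, " "), b.ljust(width, " ")
    return (pa > pb) - (pa < pb)


# Tab (0x09) sorts below space (0x20), so 'A' padded to 'A ' compares
# as greater than 'A\t', even though 'A' is a prefix of 'A\t'.
padded_order = pad_compare("A", "A\t")
# Without padding, the prefix sorts first.
unpadded_prefix_first = "A" < "A\t"
```

In this model the padded comparison orders the two values the opposite way from a plain comparison, which is the kind of odd result the description refers to.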



--
This message was sent by Atlassian JIRA
(v6.2#6252)
