hive-issues mailing list archives

From "Hari Sankar Sivarama Subramaniyan (JIRA)" <>
Subject [jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
Date Fri, 13 May 2016 00:44:13 GMT


Hari Sankar Sivarama Subramaniyan commented on HIVE-13708:

[~thejas] I checked whether we could do this in a generic way. As you mentioned, we can perform
a deep check of the object inspector after initialize() and see whether its types match the
column types in the table definition. My concern is whether this is backward compatible or whether
it will break things that used to work. If we have not enforced this rule before,
how can we expect custom serde developers to know that it is now an enforced
rule in Hive? Also, it looked cleaner to implement this check in the serde itself (for
example, RegexSerDe does a similar check in initialize()), since it seems to be the
responsibility of the serde, not the query processor, to interpret the data correctly. Let
me know your feedback.
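
For illustration only, the serde-side check described above could be modeled roughly as follows. This is a Python sketch, not Hive's Java SerDe API: the function name, the exception class, and the way the supported types are passed in are all assumptions made for the example.

```python
# Hypothetical model of a serde-side type check; not Hive's actual Java API.
class UnsupportedColumnTypeError(ValueError):
    """Raised when a declared column type is not supported by the serde."""


def check_column_types(serde_name, columns, supported_types):
    """Reject any (name, type) pair whose type the serde cannot produce.

    columns: list of (column_name, declared_type) from the table definition.
    supported_types: set of lower-case base type names the serde handles.
    """
    for name, declared in columns:
        # Strip parameters such as the "(38,10)" in "decimal(38,10)".
        base = declared.split("(", 1)[0].strip().lower()
        if base not in supported_types:
            raise UnsupportedColumnTypeError(
                f"{serde_name} does not support type {declared!r} "
                f"for column {name!r}"
            )


# A CSV-style serde that only ever returns strings (assumed for the example).
STRING_ONLY = {"string"}
check_column_types("CSVSerde", [("name", "string")], STRING_ONLY)  # passes
```

With such a check run from the serde's own initialization, a table declaring totalprice as DECIMAL(38,10) on a string-only serde would be rejected at create time instead of silently reading back as string.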


> Create table should verify datatypes supported by the serde
> -----------------------------------------------------------
>                 Key: HIVE-13708
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Thejas M Nair
>            Assignee: Hari Sankar Sivarama Subramaniyan
>            Priority: Critical
>         Attachments: HIVE-13708.1.patch
> As [~Goldshuv] mentioned in HIVE-7777.
> Create table with a serde such as OpenCSVSerde allows for creation of a table with columns
> of arbitrary types. But 'describe table' would still return string datatypes, and so do
> selects on the table.
> This is misleading and would result in users not getting intended results.
> The create table ideally should disallow the creation of such tables with unsupported types.
> Example posted by [~Goldshuv] in HIVE-7777 -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) 
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with 
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") 
> LOCATION '<some location>' 
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this sql:
> hive> select min(totalprice) from test;
> In this case, given my data, the result should have been 874.89, but the actual result
> was 100001.57 (as it comes first according to the byte ordering of a string type). This is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice        	string              	from deserializer
> ...
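
The ordering effect in the quoted report can be reproduced outside Hive. The Python sketch below uses only the two values named in the report (the reporter's full data set is not shown, so the list is an assumption) and compares the minimum taken over strings with the minimum taken over decimals.

```python
from decimal import Decimal

# Only the two values mentioned in the report; the real table had more rows.
prices = ["874.89", "100001.57"]

# Compared as strings, "1" sorts before "8", so the larger number wins.
min_as_string = min(prices)

# Compared as decimals, the numerically smaller value wins.
min_as_decimal = min(Decimal(p) for p in prices)

print(min_as_string)   # 100001.57
print(min_as_decimal)  # 874.89
```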

This message was sent by Atlassian JIRA
