hive-issues mailing list archives

From "Junjie Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.
Date Fri, 13 Jul 2018 03:23:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542455#comment-16542455 ]

Junjie Chen commented on HIVE-17593:
------------------------------------

The previous unit test failure (vectorized_parquet_types.q) occurs because different length
UDFs are used for CHAR in vectorized and non-vectorized mode.

In non-vectorized mode, GenericUDFLength calculates the length of the column. It converts the
primitive value to a string using PrimitiveObjectInspectorUtil.getString, which ignores trailing
spaces for the CHAR type.
In vectorized mode, however, StringLength calculates the length of the column. It treats the
column as a byte array and does not consider the column type, so any trailing spaces are included.
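
The difference can be sketched in plain Java without any Hive classes. This is only an illustration under stated assumptions: strippedLength and rawLength are hypothetical helpers standing in for the behaviour described above, not the actual GenericUDFLength or StringLength code, and counting UTF-8 characters in rawLength is an assumption about how the byte array is measured.

```java
import java.nio.charset.StandardCharsets;

final class CharLengthDemo {
    // Hypothetical stand-in for the non-vectorized path:
    // trailing spaces are ignored before the length is taken.
    static int strippedLength(String padded) {
        int end = padded.length();
        while (end > 0 && padded.charAt(end - 1) == ' ') {
            end--;
        }
        return padded.codePointCount(0, end);
    }

    // Hypothetical stand-in for the vectorized path:
    // the value is just a byte array, so the column type is unknown
    // and trailing spaces are counted like any other character.
    static int rawLength(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            // Count every byte that is not a UTF-8 continuation byte.
            if ((b & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String padded = "abc  "; // a value padded with two trailing spaces
        System.out.println("stripped length: " + strippedLength(padded));
        System.out.println("raw length:      "
                + rawLength(padded.getBytes(StandardCharsets.UTF_8)));
    }
}
```

For the same padded value the two helpers return different lengths, which is the kind of mismatch that makes the expected output of a .q test differ between the two modes.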

> DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-17593
>                 URL: https://issues.apache.org/jira/browse/HIVE-17593
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.3.0, 3.0.0
>            Reporter: Junjie Chen
>            Assignee: Junjie Chen
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>
> DataWritableWriter strips trailing spaces from CHAR values before writing. However, the predicate
> generator does NOT do the same stripping, which should cause rows to be missed!
> In the current version this does not cause missing data, because the predicate is not properly
> pushed down to Parquet due to HIVE-17261.
> Please see ConvertAstToSearchArg.java: getTypes treats CHAR and STRING the same, which
> builds a predicate with trailing spaces.
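
The mismatch described in the issue can be illustrated with a small standalone Java sketch. It assumes a simplified model in which stored values have already had trailing spaces removed and the pushed-down predicate is a plain equality test; stripTrailingSpaces and filterEquals are hypothetical helpers and do not correspond to Hive or Parquet APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class CharPredicateDemo {
    // Hypothetical helper: removes trailing spaces, as the writer is described to do.
    static String stripTrailingSpaces(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(0, end);
    }

    // Hypothetical helper: a simplified equality predicate over stored values.
    static List<String> filterEquals(List<String> stored, String literal) {
        return stored.stream()
                .filter(literal::equals)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Values as they would look after the writer stripped trailing spaces.
        List<String> stored = Arrays.asList("abc", "de");
        String paddedLiteral = "abc  "; // predicate literal still carrying padding

        // The padded literal does not equal the stripped stored value.
        System.out.println(filterEquals(stored, paddedLiteral));
        // Stripping the literal the same way restores the match.
        System.out.println(filterEquals(stored, stripTrailingSpaces(paddedLiteral)));
    }
}
```

In this model the padded literal matches nothing, while the stripped literal matches the stored row, which is why applying the same stripping on both sides matters once predicates are actually pushed down.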



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)