hive-issues mailing list archives

From "Matt McCline (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-20044) Arrow Serde should pad char values and handle empty strings correctly
Date Wed, 01 Aug 2018 00:17:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-20044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564539#comment-16564539 ]

Matt McCline commented on HIVE-20044:
-------------------------------------

[~teddy.choi] [~ewohlstadter] Second thoughts...

Do you need to write a variation of StringExpr.padRight that accounts for Unicode?  Do
any other parts of your Arrow SerDe need to consider Unicode character length vs. byte length?
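
To make the character-length vs. byte-length distinction concrete, here is a minimal sketch of padding a UTF-8 value to a target number of Unicode code points rather than bytes. The class and method names (CharPadSketch, utf8CodePointCount, padRightByCodePoints) are hypothetical and for illustration only; this is not the code in StringExpr or in the Arrow SerDe, and it assumes well-formed UTF-8 input and no truncation of over-long values.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharPadSketch {

    /**
     * Counts Unicode code points in a UTF-8 byte range.
     * Every byte that is not a continuation byte (10xxxxxx) starts a new code point.
     * Assumes the input is well-formed UTF-8.
     */
    public static int utf8CodePointCount(byte[] bytes, int start, int length) {
        int count = 0;
        for (int i = start; i < start + length; i++) {
            if ((bytes[i] & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    /**
     * Returns a copy of the value padded on the right with ASCII spaces
     * until it holds targetChars code points. Values that already have
     * targetChars or more code points are copied unchanged (no truncation).
     */
    public static byte[] padRightByCodePoints(byte[] utf8, int targetChars) {
        int chars = utf8CodePointCount(utf8, 0, utf8.length);
        if (chars >= targetChars) {
            return Arrays.copyOf(utf8, utf8.length);
        }
        int pad = targetChars - chars;
        // Arrays.copyOf extends the array; the new tail is then filled with spaces.
        byte[] out = Arrays.copyOf(utf8, utf8.length + pad);
        Arrays.fill(out, utf8.length, out.length, (byte) ' ');
        return out;
    }

    public static void main(String[] args) {
        // "n" + U+00E9 is 2 code points but 3 bytes in UTF-8.
        byte[] value = "n\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] padded = padRightByCodePoints(value, 4);
        // Padding by code points adds 2 spaces (5 bytes total);
        // padding to 4 bytes would add only 1 space and leave 3 characters.
        System.out.println(padded.length);
        System.out.println(utf8CodePointCount(padded, 0, padded.length));
    }
}
```

With a char(4) column, the value above needs two pad spaces when measured in characters, but only one if the byte length were used by mistake.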

> Arrow Serde should pad char values and handle empty strings correctly
> ---------------------------------------------------------------------
>
>                 Key: HIVE-20044
>                 URL: https://issues.apache.org/jira/browse/HIVE-20044
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20044.1.branch-3.patch, HIVE-20044.1.patch, HIVE-20044.1.patch, HIVE-20044.patch
>
>
> When Arrow Serde serializes char values, it loses padding. Also when it counts empty
> strings, sometimes it makes a smaller number. It should pad char values and handle empty strings
> correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
