hive-issues mailing list archives

From "Eric Wohlstadter (JIRA)" <>
Subject [jira] [Commented] (HIVE-19723) Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
Date Fri, 01 Jun 2018 21:20:00 GMT


Eric Wohlstadter commented on HIVE-19723:


The Arrow serializer appears to truncate down to MILLISECONDS, but the Jira description

This is motivated by {{org.apache.spark.sql.execution.arrow.ArrowUtils.scala}}:
{code}
case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => TimestampType
{code}

My understanding is that, since the primary use case for {{ArrowUtils}} is Python integration,
some of the conversions are currently specific to Python. Perhaps Python/Pandas
only supports MICROSECOND timestamps.
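
To make the difference between the units concrete, here is a minimal Java sketch, assuming a timestamp is held as a {{long}} count of nanoseconds since the epoch. The class and method names are hypothetical and are not taken from the Hive patch or from Spark; the sketch only shows how much precision each unit keeps.

```java
public class TimestampUnits {
    // Hypothetical helper: keep microsecond precision, dropping the last three
    // nanosecond digits. floorDiv rounds toward negative infinity, so values
    // before the epoch are handled consistently.
    static long nanosToMicros(long epochNanos) {
        return Math.floorDiv(epochNanos, 1_000L);
    }

    // Hypothetical helper: keep only millisecond precision, dropping six digits.
    static long nanosToMillis(long epochNanos) {
        return Math.floorDiv(epochNanos, 1_000_000L);
    }

    public static void main(String[] args) {
        // Example value chosen for illustration: nanoseconds since the epoch.
        long epochNanos = 1_527_888_000_123_456_789L;

        System.out.println("nanos : " + epochNanos);
        System.out.println("micros: " + nanosToMicros(epochNanos));
        System.out.println("millis: " + nanosToMillis(epochNanos));
    }
}
```

With MICROSECOND the fractional part {{.123456}} survives, whereas MILLISECOND keeps only {{.123}}, so a test asserting microsecond precision would fail against a serializer that truncates to milliseconds.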

FYI: [~hyukjin.kwon] [~bryanc]

> Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
> -----------------------------------------------------------------
>                 Key: HIVE-19723
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.1.0, 4.0.0
>         Attachments: HIVE-19723.1.patch, HIVE-19732.2.patch
> Spark's Arrow support only provides Timestamp at MICROSECOND granularity. Spark 2.3.0 won't accept NANOSECOND. Switch it back to MICROSECOND.
> The unit test org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow will just need to change the assertion to test microsecond. And we'll need to add this to documentation on supported

This message was sent by Atlassian JIRA
