hive-issues mailing list archives

From "Matt McCline (JIRA)" <>
Subject [jira] [Comment Edited] (HIVE-15335) Fast Decimal
Date Fri, 16 Dec 2016 16:09:58 GMT


Matt McCline edited comment on HIVE-15335 at 12/16/16 4:09 PM:

Given that the ColumnVector family exposes public members (e.g. vector) of its classes, clients
are coupled to those classes at the compilation level.  They need to recompile against each release.
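
To illustrate the coupling described above, here is a minimal sketch. The class names
(LongColumnSketch, PublicFieldCouplingDemo) are hypothetical stand-ins and are not Hive's
or ORC's actual classes; only the idea of a public "vector" member comes from the comment.
The client reads the public array directly, so its compiled bytecode refers to the field
by name and type.

```java
// Hypothetical client code; illustrates direct access to a public field.
final class PublicFieldCouplingDemo {
    // Sums the first `size` entries by reading the public array directly.
    static long sum(LongColumnSketch col, int size) {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += col.vector[i];
        }
        return total;
    }

    public static void main(String[] args) {
        LongColumnSketch col = new LongColumnSketch(3);
        col.vector[0] = 1;
        col.vector[1] = 2;
        col.vector[2] = 3;
        System.out.println(sum(col, 3)); // prints 6
    }
}

// Hypothetical stand-in for a column vector class; NOT the real ColumnVector.
final class LongColumnSketch {
    // Public field: callers compile against its exact name and type.
    public long[] vector;

    LongColumnSketch(int size) {
        this.vector = new long[size];
    }
}
```

If the type of such a field changed in a later release (say, from long[] to another
representation), a client compiled against the old class would be expected to fail at
link time rather than adapt, which is why a recompile per release is needed.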

If a client uses ORC to read vectorized data (VectorizedRowBatch), then what used to be an internal
non-shared data structure is now public.  What a mess.

I think the answer may well be that Hive and ORC are going to have to stay linked together.
We release them together.  The new feature is that you can use ORC to read ORC files and don't
need to invoke Hive.  But ORC isn't a fully separate project that can release separately.
It always has to release with its parent.

was (Author: mmccline):
I am very clear that I do not want to support third parties writing vectorized UDFs on our current
data structures without any design thought whatsoever being given to it.  I've always
considered the vector classes to be internal non-shared data structures.  Certainly not public.

And, clearly, an alternative to examine is to say that early versions of ORC using the old HiveDecimal
are compatible with an early range of Hive, and newer ORC versions are only compatible with
newer Hive versions.

> Fast Decimal
> ------------
>                 Key: HIVE-15335
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-15335.01.patch, HIVE-15335.02.patch, HIVE-15335.03.patch, HIVE-15335.04.patch,
> HIVE-15335.05.patch, HIVE-15335.06.patch, HIVE-15335.07.patch, HIVE-15335.08.patch, HIVE-15335.09.patch,
> HIVE-15335.091.patch, HIVE-15335.092.patch, HIVE-15335.093.patch, HIVE-15335.094.patch, HIVE-15335.095.patch
>
> Replace HiveDecimal implementation that currently represents the decimal internally as
> a BigDecimal with a faster version that does not allocate extra objects
> Replace HiveDecimalWritable implementation with a faster version that has new mutable*
> calls (e.g. mutableAdd, mutableEnforcePrecisionScale, etc) and stores the result as a fast
> decimal instead of a slow byte array containing a serialized BigInteger.
> Provide faster ways to serialize/deserialize decimals.
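
The idea of mutable* calls that avoid allocating extra objects can be sketched as follows.
This is only an illustration under stated assumptions, not the implementation in the
attached patches: the class MutableDecimalSketch and its single-long representation are
hypothetical, and only the method name mutableAdd is borrowed from the issue description.
The sketch keeps an unscaled long plus a scale and adds in place, where BigDecimal.add
would return a new object on every call.

```java
import java.math.BigDecimal;

// Hypothetical sketch; NOT Hive's fast decimal. Value = unscaled * 10^(-scale).
final class MutableDecimalSketch {
    private long unscaled;
    private int scale;

    MutableDecimalSketch(long unscaled, int scale) {
        this.unscaled = unscaled;
        this.scale = scale;
    }

    // Adds `other` into this object in place; no new decimal object is created.
    // Throws ArithmeticException if the result does not fit in a long.
    void mutableAdd(MutableDecimalSketch other) {
        int targetScale = Math.max(scale, other.scale);
        long a = rescale(unscaled, scale, targetScale);
        long b = rescale(other.unscaled, other.scale, targetScale);
        unscaled = Math.addExact(a, b);
        scale = targetScale;
    }

    // Multiplies by 10 once per extra digit of scale, checking for overflow.
    private static long rescale(long value, int from, int to) {
        long result = value;
        for (int i = from; i < to; i++) {
            result = Math.multiplyExact(result, 10L);
        }
        return result;
    }

    // Conversion used only for display and checking; this does allocate.
    BigDecimal toBigDecimal() {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        MutableDecimalSketch acc = new MutableDecimalSketch(1250, 2); // 12.50
        acc.mutableAdd(new MutableDecimalSketch(5, 1));               // 0.5
        System.out.println(acc.toBigDecimal()); // prints 13.00
    }
}
```

A single long covers far fewer digits than a real decimal type needs, so this shows only
the in-place update pattern, not the precision handling.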

This message was sent by Atlassian JIRA
