hive-issues mailing list archives

From "Gopal V (JIRA)" <>
Subject [jira] [Commented] (HIVE-10286) SARGs: Type Safety via PredicateLeaf.type
Date Thu, 09 Apr 2015 22:26:12 GMT


Gopal V commented on HIVE-10286:

Here are some of the corner cases I noticed with the raw types.

For instance, a Decimal column with stats (min: 9.0, max: 99.0), when searched with "11"
(as a string), should not fall back to min.toString() and max.toString() for the comparison.
String ordering is wrong for numeric values, particularly for negative numbers.
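
To make the Decimal case concrete, here is a minimal sketch of the two comparison styles. The class and method names are hypothetical and are not taken from the Hive code base; the sketch only assumes that the stats carry min/max as BigDecimal and that the literal arrives as a String.

```java
import java.math.BigDecimal;

public class DecimalStatsCompare {
    // Hypothetical helper: convert the literal to the column type first,
    // then compare numerically against the min/max stats.
    static boolean inRangeNumeric(BigDecimal min, BigDecimal max, String literal) {
        BigDecimal v = new BigDecimal(literal);
        return min.compareTo(v) <= 0 && v.compareTo(max) <= 0;
    }

    // The problematic variant: convert the stats to String and compare
    // lexicographically against the literal.
    static boolean inRangeAsStrings(BigDecimal min, BigDecimal max, String literal) {
        return min.toString().compareTo(literal) <= 0
            && literal.compareTo(max.toString()) <= 0;
    }

    public static void main(String[] args) {
        BigDecimal min = new BigDecimal("9.0");
        BigDecimal max = new BigDecimal("99.0");
        // Numerically 11 lies inside [9.0, 99.0].
        System.out.println(inRangeNumeric(min, max, "11"));
        // As strings, "9.0" sorts after "11", so the range check fails.
        System.out.println(inRangeAsStrings(min, max, "11"));
    }
}
```

With the string variant, a row group that does contain 11 would be skipped, which is a wrong result rather than a missed optimization. The same happens for a range such as (min: -20, max: -3) searched with "-5".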

Missing type conversions for a predicate literal can also cause hash misses in the bloom filter.

While evaluating <double-col> IN (1, 2.2), you cannot use the Long hashCode of the
first literal (1) when checking the bloom filter.
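
A small illustration of why the literal's raw type matters for hashing. This uses Java's built-in hashCode() purely as a stand-in; the hash function the bloom filter actually uses is not described in this message, and the helper names below are hypothetical.

```java
public class InListHashing {
    // Hypothetical stand-in: hash the literal in whatever type the parser gave it.
    static int hashAsWritten(Object literal) {
        return literal.hashCode();
    }

    // Hypothetical stand-in: coerce the literal to the column type (double) first.
    static int hashForDoubleColumn(Number literal) {
        return Double.valueOf(literal.doubleValue()).hashCode();
    }

    public static void main(String[] args) {
        Object first = Long.valueOf(1L);      // the "1" in IN (1, 2.2)
        Object second = Double.valueOf(2.2);  // the "2.2" in IN (1, 2.2)

        // The Long hash of 1 differs from the hash of the double 1.0 ...
        System.out.println(hashAsWritten(first) == hashForDoubleColumn((Number) first));
        // ... while 2.2 is already a double, so both paths agree.
        System.out.println(hashAsWritten(second) == hashForDoubleColumn((Number) second));
    }
}
```

If the column was hashed as double on the write side, probing with the Long hash of 1 would report "definitely absent" for a value that is present.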

When evaluating Date and Timestamp columns, while they might be stored as Long inside the
stats implementation, they hold vastly different semantic data: days since 1970 for Date and
milliseconds since 1970 for Timestamp.
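
As a rough sketch of the unit mismatch, the same calendar day yields two very different Long values depending on whether it is treated as a Date or a Timestamp. The class and method names are hypothetical, and the timestamp is assumed to be midnight UTC of that day.

```java
import java.time.LocalDate;
import java.util.concurrent.TimeUnit;

public class EpochUnits {
    // Date semantics: whole days since 1970-01-01.
    static long daysSinceEpoch(LocalDate d) {
        return d.toEpochDay();
    }

    // Timestamp semantics: milliseconds since 1970-01-01T00:00:00Z
    // (assumption: midnight UTC of the given day).
    static long millisSinceEpoch(LocalDate d) {
        return TimeUnit.DAYS.toMillis(d.toEpochDay());
    }

    public static void main(String[] args) {
        LocalDate day = LocalDate.of(2015, 4, 9);
        long asDate = daysSinceEpoch(day);
        long asTimestamp = millisSinceEpoch(day);
        // Both are plain longs, but they differ by a factor of 86,400,000.
        System.out.println(asDate + " days vs " + asTimestamp + " ms");
    }
}
```

Comparing a Date literal against Timestamp stats (or the reverse) without converting units would make almost every range check meaningless.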

Those are a few of the corner cases to test/confirm.

The good part about all these is that we can return YES_NO_NULL for all cases where we can't
produce any reduction.
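
A minimal sketch of that fallback, assuming a Decimal column with min/max stats and an equality predicate. The local TruthValue enum is a cut-down stand-in for the one used by the SARG code, and evaluateEquals is a hypothetical helper, not the actual evaluator.

```java
import java.math.BigDecimal;

public class SafeFallback {
    // Cut-down stand-in for the SARG truth values; only the ones used here.
    enum TruthValue { NO, NO_NULL, YES_NO_NULL }

    // Hypothetical evaluator for "col = literal" against min/max stats.
    static TruthValue evaluateEquals(BigDecimal min, BigDecimal max,
                                     boolean hasNull, Object literal) {
        BigDecimal v;
        try {
            // Coerce the literal to the column type before comparing.
            v = new BigDecimal(literal.toString());
        } catch (NumberFormatException e) {
            // Cannot convert safely: give up on pruning, keep the row group.
            return TruthValue.YES_NO_NULL;
        }
        if (v.compareTo(min) < 0 || v.compareTo(max) > 0) {
            // Literal is outside the stats range: no non-null row can match.
            return hasNull ? TruthValue.NO_NULL : TruthValue.NO;
        }
        // Inside the range: the stats alone cannot decide.
        return TruthValue.YES_NO_NULL;
    }

    public static void main(String[] args) {
        BigDecimal min = new BigDecimal("9.0");
        BigDecimal max = new BigDecimal("99.0");
        System.out.println(evaluateEquals(min, max, false, "11"));
        System.out.println(evaluateEquals(min, max, false, "100"));
        System.out.println(evaluateEquals(min, max, false, "not-a-number"));
    }
}
```

Returning YES_NO_NULL costs only the lost pruning opportunity; the rows are still filtered correctly later, so it is always a safe answer when the types do not line up.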

> SARGs: Type Safety via PredicateLeaf.type
> -----------------------------------------
>                 Key: HIVE-10286
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats, Serializers/Deserializers
>            Reporter: Gopal V
>            Assignee: Prasanth Jayachandran
> The Sargs impl today converts the statsObj to the type of the predicate object before
> doing any comparisons.
> To satisfy the PPD requirements, the conversion has to be coerced to the type specified
> in PredicateLeaf.type.
> The type conversions in Hive are standard and have a fixed promotion order.
> Therefore the PredicateLeaf has to do type changes which match the exact order of type
> coercions offered by the FilterOperator.
