hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Maughan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-15328) Inconsistent/incorrect handling of NULL in nested structs
Date Thu, 01 Dec 2016 12:57:58 GMT

     [ https://issues.apache.org/jira/browse/HIVE-15328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

David Maughan updated HIVE-15328:
---------------------------------
    Description: 
h2. Overview

Performing {{IS NULL}} checks against a null struct that is generated as part of a UDF correctly
returns {{true}}. However, the same check against the same null struct that has been persisted
to a table incorrectly returns {{false}}. Additionally, when a child field of the null struct
is inspected in the same query, the result of the null check on the struct itself reverses
itself to {{true}}.

The issue does not appear to be dependent on the storage format of the table as the same result
is repeated with TEXTFILE, PARQUET, ORC and AVRO.

h2. Example

In this example I have used {{if(1=1, null, named_struct('c', 1))}} as a crude method of generating
a simple null struct.

h4. 'b' is correctly reported as {{true}}.
{code}
hive> select
    >   b is null,
    >   b
    > from (
    >   select
    >     if(1=1, null, named_struct('c', 1)) as b
    >   ) as a;
OK
true	NULL
{code}

h4. 'b' is correctly reported as {{true}} when also inspecting 'b.c'.
{code}
hive>
    > select
    >   b is null,
    >   b.c is null,
    >   b
    > from (
    >   select
    >     if(1=1, null, named_struct('c', 1)) as b
    >   ) as a;
OK
true	true	NULL
{code}

h4. Persist the data to a table
{code}
hive>
    > create table a
    >   as
    > select
    >   if(1=1, null, named_struct('c', 1)) as b;
OK
{code}

h4. 'b' is incorrectly reported as {{false}}.
{code}
hive>
    > select
    >   b is null,
    >   b
    > from a;
OK
false	NULL
{code}

h4. 'b' is now correctly reported as {{true}} when also inspecting 'b.c'.
{code}
hive>
    > select
    >   b is null,
    >   b.c is null,
    >   b
    > from a;
OK
true	true	NULL
{code}

  was:
h2. Overview

Performing {{IS NULL}} checks against a null struct that is generated as part of a UDF correctly
returns {{true}}. However, the same check against the same null struct that has been persisted
to a table incorrectly returns {{false}}. Additionally, when a child field of the null struct
is inspected in the same query, the result of the null check on the struct itself reverses
itself to {{true}}.

The issue does not appear to be dependent on the storage format of the table as the same result
is repeated with TEXTFILE, PARQUET, ORC and AVRO.

h2. Example

In this example I have used {{if(1=1, null, named_struct('c', 1)}} as a crude method of generating
a simple null struct.

h4. 'b' is correctly reported as {{true}}.
{code}
hive> select
    >   b is null,
    >   b
    > from (
    >   select
    >     if(1=1, null, named_struct('c', 1)) as b
    >   ) as a;
OK
true	NULL
{code}

h4. 'b' is correctly reported as {{true}} when also inspecting 'b.c'.
{code}
hive>
    > select
    >   b is null,
    >   b.c is null,
    >   b
    > from (
    >   select
    >     if(1=1, null, named_struct('c', 1)) as b
    >   ) as a;
OK
true	true	NULL
{code}

h4. Persist the data to a table
{code}
hive>
    > create table a
    >   as
    > select
    >   if(1=1, null, named_struct('c', 1)) as b;
OK
{code}

h4. 'b' is incorrectly reported as {{false}}.
{code}
hive>
    > select
    >   b is null,
    >   b
    > from a;
OK
false	NULL
{code}

h4. 'b' is now correctly reported as {{true}} when also inspecting 'b.c'.
{code}
hive>
    > select
    >   b is null,
    >   b.c is null,
    >   b
    > from a;
OK
true	true	NULL
{code}


> Inconsistent/incorrect handling of NULL in nested structs
> ---------------------------------------------------------
>
>                 Key: HIVE-15328
>                 URL: https://issues.apache.org/jira/browse/HIVE-15328
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 2.1.0
>            Reporter: David Maughan
>
> h2. Overview
> Performing {{IS NULL}} checks against a null struct that is generated as part of a UDF
correctly returns {{true}}. However, the same check against the same null struct that has
been persisted to a table incorrectly returns {{false}}. Additionally, when a child field
of the null struct is inspected in the same query, the result of the null check on the struct
itself reverses itself to {{true}}.
> The issue does not appear to be dependent on the storage format of the table as the same
result is repeated with TEXTFILE, PARQUET, ORC and AVRO.
> h2. Example
> In this example I have used {{if(1=1, null, named_struct('c', 1))}} as a crude method
of generating a simple null struct.
> h4. 'b' is correctly reported as {{true}}.
> {code}
> hive> select
>     >   b is null,
>     >   b
>     > from (
>     >   select
>     >     if(1=1, null, named_struct('c', 1)) as b
>     >   ) as a;
> OK
> true	NULL
> {code}
> h4. 'b' is correctly reported as {{true}} when also inspecting 'b.c'.
> {code}
> hive>
>     > select
>     >   b is null,
>     >   b.c is null,
>     >   b
>     > from (
>     >   select
>     >     if(1=1, null, named_struct('c', 1)) as b
>     >   ) as a;
> OK
> true	true	NULL
> {code}
> h4. Persist the data to a table
> {code}
> hive>
>     > create table a
>     >   as
>     > select
>     >   if(1=1, null, named_struct('c', 1)) as b;
> OK
> {code}
> h4. 'b' is incorrectly reported as {{false}}.
> {code}
> hive>
>     > select
>     >   b is null,
>     >   b
>     > from a;
> OK
> false	NULL
> {code}
> h4. 'b' is now correctly reported as {{true}} when also inspecting 'b.c'.
> {code}
> hive>
>     > select
>     >   b is null,
>     >   b.c is null,
>     >   b
>     > from a;
> OK
> true	true	NULL
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message