hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15434) Add UDF to allow interrogation of uniontype values
Date Fri, 16 Dec 2016 11:22:58 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754162#comment-15754162 ]

Hive QA commented on HIVE-15434:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12843576/HIVE-15434.02.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10825 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=128)
	[stats13.q,router_join_ppr.q,auto_join13.q,vector_mapjoin_reduce.q,ptf_register_tblfn.q,join_merging.q,union_date_trim.q,groupby3_noskew.q,optimize_nullscan.q,join3.q,join38.q,skewjoinopt1.q,join_alt_syntax.q,groupby_sort_1_23.q,timestamp_udf.q]
TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_functions] (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sort_array] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2607/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2607/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2607/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12843576 - PreCommit-HIVE-Build

> Add UDF to allow interrogation of uniontype values
> --------------------------------------------------
>
>                 Key: HIVE-15434
>                 URL: https://issues.apache.org/jira/browse/HIVE-15434
>             Project: Hive
>          Issue Type: New Feature
>          Components: UDF
>    Affects Versions: 2.1.1
>            Reporter: David Maughan
>            Assignee: David Maughan
>         Attachments: HIVE-15434.01.patch, HIVE-15434.02.patch
>
>
> h2. Overview
> As stated in the documentation:
> {quote}
> UNIONTYPE support is incomplete: The UNIONTYPE datatype was introduced in Hive 0.7.0 (HIVE-537), but full support for this type in Hive remains incomplete. Queries that reference UNIONTYPE fields in JOIN (HIVE-2508), WHERE, and GROUP BY clauses will fail, and Hive does not define syntax to extract the tag or value fields of a UNIONTYPE. This means that UNIONTYPEs are effectively look-at-only.
> {quote}
> It is essential to have a usable uniontype. Until full support is added to Hive, users should at least be able to inspect and extract values for further comparison or transformation.
> h2. Proposal
> I propose adding a GenericUDF with two modes of operation. Consider the following schema and data, which contain a union:
> Schema:
> {code}
> struct<field1:uniontype<int,string>>
> {code}
> Query:
> {code}
> hive> select field1 from thing;
> {0:0}
> {1:"one"}
> {code}
> h4. Explode to Struct
> This method will recursively convert all unions within the type to structs with fields named {{tag_n}}, where {{n}} is the tag number. Only the {{tag_*}} field that matches the tag of the union will be populated with the value. In the case above, the schema of field1 will be converted to:
> {code}
> struct<tag_0:int,tag_1:string>
> {code}
> {code}
> hive> select extract_union(field1) from thing;
> {"tag_0":0,"tag_1":null}
> {"tag_0":null,"tag_1":"one"}
> {code}
> {code}
> hive> select extract_union(field1).tag_0 from thing;
> 0
> null
> {code}
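
The recursive conversion described above can be modelled outside Hive. The following is only a minimal Python sketch of the intended semantics, not the patch's implementation: `UnionValue` and `explode` are hypothetical names, and the `arity` field is an assumed stand-in for the number of alternatives that Hive would read from the uniontype's type information.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class UnionValue:
    """Hypothetical stand-in for a Hive uniontype value."""
    tag: int     # index of the populated alternative
    value: Any   # the value held by that alternative
    arity: int   # number of alternatives declared in the uniontype


def explode(value: Any) -> Any:
    """Recursively replace every union with a dict of tag_n fields."""
    if isinstance(value, UnionValue):
        # Only the field matching the active tag is populated.
        return {
            f"tag_{n}": explode(value.value) if n == value.tag else None
            for n in range(value.arity)
        }
    if isinstance(value, dict):   # struct or map: convert each field
        return {k: explode(v) for k, v in value.items()}
    if isinstance(value, list):   # array: convert each element
        return [explode(v) for v in value]
    return value                  # primitives pass through unchanged


# The two rows of field1 from the example above.
rows = [UnionValue(0, 0, 2), UnionValue(1, "one", 2)]
exploded = [explode(r) for r in rows]
print(exploded)
```

Selecting `tag_0` from each converted row then corresponds to the `extract_union(field1).tag_0` query above.
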
> h4. Extract the specified tag
> This method will simply extract the value of the specified tag. If the tag number matches, the value is returned; otherwise null is returned.
> {code}
> hive> select extract_union(field1, 0) from thing;
> 0
> null
> {code}
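
The tag-extraction mode reduces to a single comparison. As a rough illustration of the behaviour described in the proposal (again a Python sketch under assumptions, with `UnionValue` and `extract_tag` as hypothetical names rather than anything from the patch):

```python
from typing import Any, NamedTuple, Optional


class UnionValue(NamedTuple):
    """Hypothetical stand-in for a Hive uniontype value."""
    tag: int     # index of the populated alternative
    value: Any   # the value held by that alternative


def extract_tag(union: UnionValue, tag: int) -> Optional[Any]:
    """Return the union's value if its tag matches, otherwise None."""
    return union.value if union.tag == tag else None


# The two rows of field1 from the example above.
rows = [UnionValue(0, 0), UnionValue(1, "one")]
tag0 = [extract_tag(r, 0) for r in rows]
tag1 = [extract_tag(r, 1) for r in rows]
print(tag0, tag1)
```

Here `None` plays the role of SQL null, so `tag0` mirrors the output of `extract_union(field1, 0)` shown above.
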



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)