hive-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-25410) CommonMergeJoinOperator fails when a join key is ARRAY with arbitrary size
Date Fri, 30 Jul 2021 18:10:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-25410?focusedWorklogId=631817&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631817 ]

ASF GitHub Bot logged work on HIVE-25410:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jul/21 18:09
            Start Date: 30/Jul/21 18:09
    Worklog Time Spent: 10m 
      Work Description: okumin opened a new pull request #2551:
URL: https://github.com/apache/hive/pull/2551


   ### What changes were proposed in this pull request?
   
   As commented in the following ticket, the current implementation can't perform a JOIN on a variable-sized ARRAY.
   This PR will let CommonMergeJoinOperator handle any ARRAY objects as JOIN keys.
   https://issues.apache.org/jira/browse/HIVE-25410
   
   ### Why are the changes needed?
   
   There is no reason to allow only fixed-length ARRAYs as JOIN keys, so this issue can be regarded as a bug.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. This case should have been covered in HIVE-24883.
   
   ### How was this patch tested?
   
   - I've modified two test cases
   - I've tested the query to reproduce the issue in the ticket


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 631817)
    Remaining Estimate: 0h
            Time Spent: 10m

> CommonMergeJoinOperator fails when a join key is ARRAY with arbitrary size
> --------------------------------------------------------------------------
>
>                 Key: HIVE-25410
>                 URL: https://issues.apache.org/jira/browse/HIVE-25410
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: okumin
>            Assignee: okumin
>            Priority: Major
>             Fix For: 4.0.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Thanks to HIVE-24883, CommonMergeJoinOperator can handle ARRAY or STRUCT types as a JOIN key.
> There are corner cases where CommonMergeJoinOperator fails with `ArrayIndexOutOfBoundsException`.
>  
> This is a simple case.
> {code:java}
> SET hive.auto.convert.join=false;
> CREATE TABLE table_list_types (id int, key array<int>);
> INSERT INTO table_list_types VALUES (1, array(1, 2)), (2, array(1, 2)), (3, array(1, 2, 3)), (4, array(1, 2, 3));
> SELECT * FROM table_list_types t1 INNER JOIN table_list_types t2 ON t1.key = t2.key;
> {code}
> With 69c97c26ac68a245f4d327cc2f7b3a2333f8fa84, the following error happened.
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
> 	at org.apache.hadoop.hive.ql.exec.HiveStructComparator.compare(HiveStructComparator.java:57)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.compareKey(CommonMergeJoinOperator.java:629)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.compareKeys(CommonMergeJoinOperator.java:597)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processKey(CommonMergeJoinOperator.java:566)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:249)
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
> 	... 26 more {code}
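
The failure mode reported above can be illustrated with a small standalone sketch. This is not the actual HiveStructComparator code; the class and method names below are hypothetical, and the assumption that the comparison width is fixed ahead of time is made only for illustration. It shows how an element-by-element comparison that assumes a fixed key width can throw `ArrayIndexOutOfBoundsException` when the keys are `array(1, 2)` and `array(1, 2, 3)`, and how a length-aware lexicographic comparison avoids it.

```java
public class ArrayKeyCompareSketch {

    // Hypothetical comparison that assumes every key has exactly `width` elements.
    // It reads past the end of a shorter key.
    static int compareFixedWidth(int[] a, int[] b, int width) {
        for (int i = 0; i < width; i++) {
            int c = Integer.compare(a[i], b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    // Length-aware lexicographic comparison: compare the shared prefix,
    // then let the shorter array sort first when the prefix is equal.
    static int compareAnyLength(int[] a, int[] b) {
        int shared = Math.min(a.length, b.length);
        for (int i = 0; i < shared; i++) {
            int c = Integer.compare(a[i], b[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        int[] shortKey = {1, 2};
        int[] longKey = {1, 2, 3};

        try {
            // Width taken from the longer key, so index 2 is read from shortKey.
            compareFixedWidth(shortKey, longKey, longKey.length);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("fixed-width comparison failed: " + e.getMessage());
        }

        System.out.println("length-aware result sign: "
                + Integer.signum(compareAnyLength(shortKey, longKey)));
    }
}
```

Ordering the shorter array first on an equal prefix is one reasonable convention for a merge join; the ordering actually chosen in the pull request may differ.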



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
