hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-25410) CommonMergeJoinOperator fails when a join key is ARRAY with arbitrary size
Date Fri, 30 Jul 2021 18:15:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-25410?focusedWorklogId=631818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631818
]

ASF GitHub Bot logged work on HIVE-25410:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jul/21 18:14
            Start Date: 30/Jul/21 18:14
    Worklog Time Spent: 10m 
      Work Description: okumin commented on a change in pull request #2551:
URL: https://github.com/apache/hive/pull/2551#discussion_r680135993



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/HiveStructComparator.java
##########
@@ -45,16 +48,14 @@ public int compare(Object key1, Object key2) {
         if (a1.size() == 0) {
             return 0;
         }
-        if (comparator == null) {

Review comment:
       The root cause is that the num of comparators is fixed to the size of the first List.
This will work for STRUCT since the number of elements is consistent across records. However,
in the case of ARRAY, it varies.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 631818)
    Time Spent: 20m  (was: 10m)

> CommonMergeJoinOperator fails when a join key is ARRAY with arbitrary size
> --------------------------------------------------------------------------
>
>                 Key: HIVE-25410
>                 URL: https://issues.apache.org/jira/browse/HIVE-25410
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: okumin
>            Assignee: okumin
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Thanks to HIVE-24883, CommonMergeJoinOperator can handle ARRAY or STRUCT types as a
JOIN key.
> There are corner cases where CommonMergeJoinOperator fails with `ArrayIndexOutOfBoundsException`.
>  
> This is a simple case.
> {code:java}
> SET hive.auto.convert.join=false;
> CREATE TABLE table_list_types (id int, key array<int>);
> INSERT INTO table_list_types VALUES (1, array(1, 2)), (2, array(1, 2)), (3, array(1,
2, 3)), (4, array(1, 2, 3));
> SELECT * FROM table_list_types t1 INNER JOIN table_list_types t2 ON t1.key = t2.key;
{code}
> With 69c97c26ac68a245f4d327cc2f7b3a2333f8fa84, the following error happened.
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
> 	at org.apache.hadoop.hive.ql.exec.HiveStructComparator.compare(HiveStructComparator.java:57)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.compareKey(CommonMergeJoinOperator.java:629)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.compareKeys(CommonMergeJoinOperator.java:597)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processKey(CommonMergeJoinOperator.java:566)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:249)
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
> 	... 26 more {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Mime
View raw message