hive-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-25410) CommonMergeJoinOperator fails when a join key is ARRAY with arbitrary size
Date Fri, 30 Jul 2021 18:20:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-25410?focusedWorklogId=631821&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631821 ]

ASF GitHub Bot logged work on HIVE-25410:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jul/21 18:19
            Start Date: 30/Jul/21 18:19
    Worklog Time Spent: 10m 
      Work Description: okumin commented on a change in pull request #2551:
URL: https://github.com/apache/hive/pull/2551#discussion_r680138295



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/WritableComparatorFactory.java
##########
@@ -19,15 +19,14 @@
 
 import org.apache.hadoop.hive.ql.util.NullOrdering;
 import org.apache.hadoop.hive.serde2.objectinspector.StandardUnionObjectInspector.StandardUnion;
-import org.apache.hadoop.io.WritableComparable;
 import org.apache.hadoop.io.WritableComparator;
 import java.util.List;
 import java.util.Map;
 
 public final class WritableComparatorFactory {
     public static WritableComparator get(Object key, boolean nullSafe, NullOrdering nullOrdering) {
         if (key instanceof List) {
-            // For array type struct is used as we do not know if all elements of array are of same type.
+            // STRUCT or ARRAY are expressed as java.util.List
             return new HiveStructComparator(nullSafe, nullOrdering);

Review comment:
       In my understanding, all elements of ARRAY should have the same type.
   - https://github.com/apache/hive/blob/eef2a5dda0470525d0d89bd9820e761fe44dc3a8/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardListObjectInspector.java#L34
   - https://github.com/apache/hive/blob/eef2a5dda0470525d0d89bd9820e761fe44dc3a8/serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/ListTypeInfo.java#L39
   
   Although UNION can hold multiple types, it is expressed as a separate, special type. Currently, it is not supported, as shown below.






Issue Time Tracking
-------------------

    Worklog Id:     (was: 631821)
    Time Spent: 0.5h  (was: 20m)

> CommonMergeJoinOperator fails when a join key is ARRAY with arbitrary size
> --------------------------------------------------------------------------
>
>                 Key: HIVE-25410
>                 URL: https://issues.apache.org/jira/browse/HIVE-25410
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: okumin
>            Assignee: okumin
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Thanks to HIVE-24883, CommonMergeJoinOperator can handle ARRAY or STRUCT types as a JOIN key.
> There are corner cases where CommonMergeJoinOperator fails with `ArrayIndexOutOfBoundsException`.
>  
> This is a simple case.
> {code:java}
> SET hive.auto.convert.join=false;
> CREATE TABLE table_list_types (id int, key array<int>);
> INSERT INTO table_list_types VALUES (1, array(1, 2)), (2, array(1, 2)), (3, array(1, 2, 3)), (4, array(1, 2, 3));
> SELECT * FROM table_list_types t1 INNER JOIN table_list_types t2 ON t1.key = t2.key;
> {code}
> With 69c97c26ac68a245f4d327cc2f7b3a2333f8fa84, the following error happened.
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
> 	at org.apache.hadoop.hive.ql.exec.HiveStructComparator.compare(HiveStructComparator.java:57)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.compareKey(CommonMergeJoinOperator.java:629)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.compareKeys(CommonMergeJoinOperator.java:597)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processKey(CommonMergeJoinOperator.java:566)
> 	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:249)
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
> 	... 26 more {code}
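
The trace above is consistent with a comparator that walks two list keys index by index as if they had the same length, which holds for STRUCT but not for ARRAY. As a rough illustration only, a length-aware lexicographic comparison might look like the sketch below. The class `ListKeyComparatorSketch` and the method `compareLists` are hypothetical names invented for this example; they are not Hive APIs and are not necessarily how the pull request resolves the issue.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, standalone sketch; not Hive code and not the actual fix.
final class ListKeyComparatorSketch {

    // Compares two list keys lexicographically without assuming equal sizes.
    static <T extends Comparable<T>> int compareLists(List<T> left, List<T> right) {
        // Only walk the prefix that both lists share, so no index goes out of bounds.
        int common = Math.min(left.size(), right.size());
        for (int i = 0; i < common; i++) {
            int c = left.get(i).compareTo(right.get(i));
            if (c != 0) {
                return c;
            }
        }
        // All shared elements are equal: the shorter list sorts first.
        return Integer.compare(left.size(), right.size());
    }

    public static void main(String[] args) {
        List<Integer> shortKey = Arrays.asList(1, 2);
        List<Integer> longKey = Arrays.asList(1, 2, 3);
        // Mirrors the keys in the reproduction query: array(1, 2) vs array(1, 2, 3).
        System.out.println(compareLists(shortKey, longKey) < 0);   // shorter key sorts first
        System.out.println(compareLists(longKey, longKey) == 0);   // identical keys are equal
    }
}
```

With this ordering, `array(1, 2)` and `array(1, 2, 3)` compare as unequal instead of raising an exception, so the rows with ids 1 and 2 would only match each other, and likewise ids 3 and 4.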



