hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <>
Subject [jira] [Resolved] (HIVE-16051) MM tables: skewjoin test fails
Date Wed, 01 Mar 2017 02:01:45 GMT


Sergey Shelukhin resolved HIVE-16051.
       Resolution: Fixed
    Fix Version/s: hive-14535

Pushed to branch. Config-driven skew join is now disabled for MM tables because of limitations
in how FSOP (FileSinkOperator) commits its output. This could be addressed, but the fix is not
simple and the feature seems very obscure.

> MM tables: skewjoin test fails
> ------------------------------
>                 Key: HIVE-16051
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>             Fix For: hive-14535
> {noformat}
> set hive.optimize.skewjoin = true;
> set hive.skewjoin.key = 2;
> set hive.optimize.metadataonly=false;
> CREATE TABLE dest_j1(key INT, value STRING) STORED AS TEXTFILE tblproperties ("transactional"="true",
> FROM src src1 JOIN src src2 ON (src1.key = src2.key)
> INSERT OVERWRITE TABLE dest_j1 SELECT src1.key, src2.value;
> select count(distinct key) from dest_j1;
> {noformat}
> Different results for MM and non-MM table.
> Probably has something to do with how skewjoin handles files; however, looking at MM/debugging
> logs, there are no suspicious deletes, and everything looks the same for both cases; all the
> logging for skewjoin row containers and stuff is identical between the two runs (except for
> the numbers/guids; the number of files, paths, etc. are all the same). So not sure what's
> going on. Probably dfs dump can answer this question, but it doesn't work for me currently
> on q files.

This message was sent by Atlassian JIRA
