hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-20456) Query fails with FNFException using MR with skewjoin enabled and auto convert join disabled
Date Fri, 24 Aug 2018 16:54:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-20456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591909#comment-16591909 ]

Hive QA commented on HIVE-20456:
--------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  7s{color} | {color:blue} ql in master has 2308 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 24s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-13448/dev-support/hive-personality.sh |
| git revision | master / 6a28265 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-13448/yetus.txt |
| Powered by | Apache Yetus    http://yetus.apache.org |


This message was automatically generated.



> Query fails with FNFException using MR with skewjoin enabled and auto convert join disabled
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-20456
>                 URL: https://issues.apache.org/jira/browse/HIVE-20456
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.2.0, 2.1.1, 3.1.0
>            Reporter: Aditya Shah
>            Assignee: Aditya Shah
>            Priority: Major
>         Attachments: HIVE-20456.patch
>
>
> When skew join is enabled and auto convert join is disabled, the query fails with a
> file-not-found exception. The following query reproduces the error:
>  
> {code:java}
> set hive.optimize.skewjoin = true;
> set hive.auto.convert.join = false;
> set hive.groupby.orderby.position.alias = true;
> set hive.on.master=true;
> set hive.execution.engine=mr;
> drop database if exists test cascade;
> create database if not exists test;
> use test;
> CREATE EXTERNAL TABLE test_table1
> ( `a` int , `b` int, `c` int)
> PARTITIONED BY (
> `d` int)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> ;
> CREATE EXTERNAL TABLE test_table2
> ( `a` int , `b` int, `c` int)
> PARTITIONED BY (
> `d` int)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
> CREATE EXTERNAL TABLE test_table3
> ( `a` int , `b` int, `c` int)
> PARTITIONED BY (
> `e` int)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> WITH SERDEPROPERTIES (
> 'field.delim'='\u0001',
> 'serialization.format'='\u0001')
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
> CREATE EXTERNAL TABLE test_table4 (`a` int , `b` int, `c` int)
> PARTITIONED BY (
> `e` string)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> WITH SERDEPROPERTIES (
> 'field.delim'='\u0001',
> 'serialization.format'='\u0001')
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
> with
> temp1 as (
> select
> g.a,
> n.b,
> u.c
> from
> test_table2 g
> inner join test_table4 u on g.a = u.a
> inner join test_table3 n on u.b = n.b
> ),
> temp2 as (
> select * from test_table4 where a = 2
> ),
> temp21 as (
> select
> g.b,
> n.c,
> u.a
> from
> temp2 g
> inner join test_table3 u on g.b = u.b
> inner join test_table2 n on u.c = n.c
> group by g.b, n.c, u.a
> ),
> stack as (
> select * from temp1
> union all
> select * from temp21
> )
> select * from stack;
> {code}
> The query runs fine when Tez is used, or with any other combination of the skew join and
> auto convert join settings. Diagnosis showed that when a conditional task resolves its
> tasks, it puts each resolved task directly into the runnable state without checking the
> task's parent dependencies or whether the task is already queued.
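
The two missing checks named in the description can be illustrated with a toy scheduler. This is a minimal sketch in Python under stated assumptions, not Hive's code and not the attached patch: `Task`, `Scheduler`, `add_unchecked`, and `add_checked` are hypothetical names introduced only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)
class Task:
    """Hypothetical stand-in for a query-plan task (not Hive's Task class)."""
    name: str
    parents: list = field(default_factory=list)
    done: bool = False
    queued: bool = False


class Scheduler:
    """Toy runnable queue, used only to illustrate the two checks."""

    def __init__(self):
        self.runnable = deque()

    def is_launchable(self, task):
        # A task may run only after every parent task has finished.
        return all(parent.done for parent in task.parents)

    def add_unchecked(self, task):
        # Behaviour described in the report: enqueue with no checks at all.
        self.runnable.append(task)

    def add_checked(self, task):
        # Guarded variant: refuse tasks that are already queued or finished,
        # and tasks whose parents have not all completed yet.
        if task.queued or task.done or not self.is_launchable(task):
            return False
        task.queued = True
        self.runnable.append(task)
        return True


parent = Task("stage-1")
child = Task("stage-2", parents=[parent])
scheduler = Scheduler()

scheduler.add_checked(child)   # rejected: parent has not finished
parent.done = True
scheduler.add_checked(child)   # accepted
scheduler.add_checked(child)   # rejected: already queued
```

With `add_unchecked`, the same task can enter the queue twice or before its parent finishes, which is the kind of ordering problem the description attributes to the conditional task.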



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
