hive-issues mailing list archives

From "Feng Yuan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-13781) Tez Job failed with FileNotFoundException when partition dir doesnt exists
Date Thu, 19 May 2016 07:31:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-13781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15290604#comment-15290604 ]

Feng Yuan edited comment on HIVE-13781 at 5/19/16 7:30 AM:
-----------------------------------------------------------

Hi [~ashutoshc], [~prasanth_j], [~vikram.dixit], [~sershe], [~gopalv]: this case works
fine on MR. Should Tez support it as well?
Details:
When the partition metadata and the storage directories are out of sync (some partition
directories do not exist), MR still runs the query successfully. Since Hive 2.0 recommends
Tez, why not make Tez more compatible with our business workloads?


was (Author: feng yuan):
Hi [~ashutoshc]: this case works fine on MR. Should Tez support it as well?
Details:
When the partition metadata and the storage directories are out of sync (some partition
directories do not exist), MR still runs the query successfully. Since Hive 2.0 recommends
Tez, why not make Tez more compatible with our business workloads?

> Tez Job failed with FileNotFoundException when partition dir doesnt exists 
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-13781
>                 URL: https://issues.apache.org/jira/browse/HIVE-13781
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 0.14.0, 2.0.0
>         Environment: hive 0.14.0 ,tez-0.5.2,hadoop 2.6.0
>            Reporter: Feng Yuan
>
> I have a partitioned table a with partition column "day". The metadata lists partitions
> day=20160501 and day=20160502, but the directory for partition 20160501 does not exist.
> When I use the Tez engine to run hive -e "select day,count(*) from a where xx=xx group by day",
> Hive throws a FileNotFoundException.
> The same query works on MR.
> Repro example:
> CREATE EXTERNAL TABLE `a`(
>   `a` string)
> PARTITIONED BY ( 
>   `l_date` string);
> insert overwrite table a partition(l_date='2016-04-08') values (1),(2);
> insert overwrite table a partition(l_date='2016-04-09') values (1),(2);
> hadoop dfs -rm -r -f /warehouse/a/l_date=2016-04-09
> select l_date,count(*) from a where a='1' group by l_date;
> error:
> ut: a initializer failed, vertex=vertex_1463493135662_10445_1_00 [Map 1], org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://bfdhadoopcool/warehouse/test.db/a/l_date=2015-04-09
> 	at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
> 	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
> 	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
> 	at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:300)
> 	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:402)
> 	at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:129)
> 	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
> 	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> 	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
> 	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
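The mismatch described in the report (partitions registered in the metastore whose directories are missing from storage) can be checked for before running a query. The following is a minimal sketch, not part of Hive or of the original report: the function names `partition_dir` and `missing_partitions` are hypothetical, the partition list is assumed to have been obtained separately (for example from SHOW PARTITIONS), and the paths assume the default key=value directory layout under the table location with no custom partition locations or escaped values.

```python
from typing import Callable, Dict, Iterable, List


def partition_dir(table_location: str, spec: Dict[str, str]) -> str:
    """Build the partition path, assuming the default key=value layout."""
    segments = [f"{key}={value}" for key, value in spec.items()]
    return "/".join([table_location.rstrip("/")] + segments)


def missing_partitions(
    table_location: str,
    specs: Iterable[Dict[str, str]],
    path_exists: Callable[[str], bool],
) -> List[Dict[str, str]]:
    """Return the partition specs whose directory is reported as absent.

    path_exists is supplied by the caller (hypothetical hook); in practice
    it would wrap whatever filesystem check is available for the cluster.
    """
    return [
        spec
        for spec in specs
        if not path_exists(partition_dir(table_location, spec))
    ]


if __name__ == "__main__":
    # Mirrors the repro: two registered partitions, one directory removed.
    existing = {"/warehouse/a/l_date=2016-04-08"}
    registered = [{"l_date": "2016-04-08"}, {"l_date": "2016-04-09"}]
    print(missing_partitions("/warehouse/a", registered, existing.__contains__))
```

Passing the existence check in as a callable keeps the sketch independent of any particular HDFS client; the partitions it reports are the ones that would trigger the InvalidInputException above on Tez.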



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
