spark-user mailing list archives

From: Yi Tian <tianyi.asiai...@gmail.com>
Subject: Re: Exception while select into table.
Date: Tue, 03 Mar 2015 10:04:23 GMT
Sorry, I made a mistake in my last mail.
In your case, Spark SQL will use

hdfs://longzhou-hdpnn.lz.dscc:11000/tmp/hive-hadoop/hive_2015-03-03_13-41-04_472_3573658402424030395-1/-ext-10000

as a temporary path to hold the result of

select * from bak_startup_log_uid_20150227 where login_time < 1425027600

and then load the files from this temporary path into the table
`startup_log_uid_20150227`.
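
As a minimal sketch of that two-step flow (assuming Spark 1.2 with a
HiveContext named `hiveContext`; the scratch path itself is chosen by Hive,
not by user code):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("insert-select"))
val hiveContext = new HiveContext(sc)

// Spark SQL first materializes the SELECT result under a Hive scratch
// directory (/tmp/hive-<user>/hive_<timestamp>_..._-ext-10000), then calls
// Hive's loadTable() to move those files into the table's final location.
hiveContext.sql(
  """INSERT INTO TABLE startup_log_uid_20150227
    |SELECT * FROM bak_startup_log_uid_20150227
    |WHERE login_time < 1425027600""".stripMargin)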
I’m not quite sure why there is a directory named
`attempt_201503031341_0057_m_003375_21951` in your case.
I guess the task `attempt_201503031341_0057_m_003375_21951` hit some
problem while it was running.
Could you find some output like

15/03/03 17:27:24 INFO output.FileOutputCommitter: Saved output of task 'attempt_201503031727_0001_m_000000_0'
to hdfs://TimMacBook:8020/tmp/hive-tianyi/hive_2015-03-03_17-27-23_675_5406986091401112367-1/-ext-10000/_temporary/0/task_201503031727_0001_m_000000

in your log file?
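
If that directory was indeed left behind by a failed attempt, deleting the
stale scratch directory and rerunning the job should be enough. As a
separate, untested workaround sketch, you could also flip the flag that
checkPaths tests (quoted from Hive in my earlier mail below); whether nested
directories are actually safe for your table layout is an assumption:

// Let Hive's checkPaths accept nested directories (the flag read via
// HiveConf.ConfVars.HIVE_HADOOP_SUPPORTS_SUBDIRECTORIES).
hiveContext.sql("SET hive.mapred.supports.subdirectories=true")
// Often paired with the Hadoop-side setting so reads also recurse into
// subdirectories (assumption; not required by checkPaths itself).
hiveContext.sql("SET mapred.input.dir.recursive=true")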

On 3/3/15 16:21, LinQili wrote:

> Hi Yi,
> Thanks for your reply.
> 1. The version of Spark is 1.2.0 and the version of Hive is
> 0.10.0-cdh4.2.1.
> 2. The full stack trace of the exception:
> 15/03/03 13:41:30 INFO Client:
>      client token: 
> DAAAAUrrav1rAADCnhQzX_Ic6CMnfqcW2NIxra5n8824CRFZQVJOX0NMSUVOVF9UT0tFTgA
>      diagnostics: User class threw exception: checkPaths: 
> hdfs://longzhou-hdpnn.lz.dscc:11000/tmp/hive-hadoop/hive_2015-03-03_13-41-04_472_3573658402424030395-1/-ext-10000

> has nested 
> directoryhdfs://longzhou-hdpnn.lz.dscc:11000/tmp/hive-hadoop/hive_2015-03-03_13-41-04_472_3573658402424030395-1/-ext-10000/attempt_201503031341_0057_m_003375_21951

>
>      ApplicationMaster host: longzhou-hdp4.lz.dscc
>      ApplicationMaster RPC port: 0
>      queue: dt_spark
>      start time: 1425361063973
>      final status: FAILED
>      tracking URL: 
> longzhou-hdpnn.lz.dscc:12080/proxy/application_1421288865131_49822/history/application_1421288865131_49822
>      user: dt
> Exception in thread "main" org.apache.spark.SparkException: 
> Application finished with failed status
>     at 
> org.apache.spark.deploy.yarn.ClientBase$class.run(ClientBase.scala:504)
>     at org.apache.spark.deploy.yarn.Client.run(Client.scala:39)
>     at org.apache.spark.deploy.yarn.Client$.main(Client.scala:143)
>     at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> It seems that you are right about the cause.
> Still, I am confused about why the nested directory is
> `hdfs://longzhou-hdpnn.lz.dscc:11000/tmp/hive-hadoop/hive_2015-03-03_13-41-04_472_3573658402424030395-1/-ext-10000/attempt_201503031341_0057_m_003375_21951`
> rather than the path that `bak_startup_log_uid_20150227` points to. What is
> in `/tmp/hive-hadoop`? What are those files used for? There seem to be a
> huge number of files in that directory.
> Thanks.
>
> On 2015-03-03 14:43, Yi Tian wrote:
>>
>> Hi,
>> Some suggestions:
>> 1. You should tell us the versions of Spark and Hive you are using.
>> 2. You should paste the full stack trace of the exception.
>>
>> In this case, I guess you have a nested directory in the path that
>> `bak_startup_log_uid_20150227` points to,
>>
>> and the config field `hive.mapred.supports.subdirectories` is `false`
>> by default,
>>
>> so…
>>
>> if (!conf.getBoolVar(HiveConf.ConfVars.HIVE_HADOOP_SUPPORTS_SUBDIRECTORIES)
>>     && item.isDir()) {
>>   throw new HiveException("checkPaths: " + src.getPath()
>>       + " has nested directory" + itemSource);
>> }
>>
>> On 3/3/15 14:36, LinQili wrote:
>>
>>> Hi all,
>>> I was running an insert-select with Spark SQL like:
>>>
>>> insert into table startup_log_uid_20150227
>>> select * from bak_startup_log_uid_20150227
>>> where login_time < 1425027600
>>>
>>> Usually, it fails with an exception:
>>>
>>> org.apache.hadoop.hive.ql.metadata.Hive.checkPaths(Hive.java:2157)
>>> org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2298)
>>> org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:686)
>>> org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1469)
>>> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:243)
>>> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:137)
>>> org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
>>> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.execute(InsertIntoHiveTable.scala:51)
>>> org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
>>> org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
>>> org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
>>> org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
>>> org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:94)
>>> com.nd.home99.LogsProcess$anonfun$main$1$anonfun$apply$1.apply(LogsProcess.scala:286)
>>> com.nd.home99.LogsProcess$anonfun$main$1$anonfun$apply$1.apply(LogsProcess.scala:83)
>>> scala.collection.immutable.List.foreach(List.scala:318)
>>> com.nd.home99.LogsProcess$anonfun$main$1.apply(LogsProcess.scala:83)
>>> com.nd.home99.LogsProcess$anonfun$main$1.apply(LogsProcess.scala:82)
>>> scala.collection.immutable.List.foreach(List.scala:318)
>>> com.nd.home99.LogsProcess$.main(LogsProcess.scala:82)
>>> com.nd.home99.LogsProcess.main(LogsProcess.scala)
>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> java.lang.reflect.Method.invoke(Method.java:601)
>>> org.apache.spark.deploy.yarn.ApplicationMaster$anon$2.run(ApplicationMaster.scala:427)
>>>
>>> Are there any hints about this?