spark-issues mailing list archives

From "Sudhakar Thota (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-8333) Spark failed to delete temp directory created by HiveContext
Date Sat, 01 Aug 2015 23:47:04 GMT

    [ https://issues.apache.org/jira/browse/SPARK-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650555#comment-14650555 ]

Sudhakar Thota commented on SPARK-8333:
---------------------------------------

Thanks for the clarification. I used the same statement you suggested to create the HiveContext
and was able to stop the sc without issues. After stopping, I validated this by trying to use
both the sqlContext and the SparkContext. Please let me know whether this still happens when you
run it from a script rather than from the REPL.
Please take a look.
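
In case it helps with reproducing outside the REPL, here is roughly what the same test could look
like as a standalone app run with spark-submit (a minimal sketch against Spark 1.4; the object
name, app name, and table name are placeholders I made up):

{code:title=HiveStopTest.scala|borderStyle=solid}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object HiveStopTest {
  def main(args: Array[String]): Unit = {
    // Same steps as the REPL session below: create the contexts,
    // run one HiveQL statement, then stop the SparkContext.
    val sc = new SparkContext(new SparkConf().setAppName("hive-stop-test"))
    val sqlContext = new HiveContext(sc)
    sqlContext.sql("CREATE TABLE IF NOT EXISTS test1 (name STRING, rank INT) " +
      "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'")
    // If the bug reproduces, the shutdown hook should log an
    // "Exception while deleting Spark temp dir" error around here.
    sc.stop()
  }
}
{code}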

-----------------
1. Create a HiveContext, create a table, call "sc.stop()", then verify by trying to create a
table again.

Sudhakars-MacBook-Pro-2:spark-1.4.0 sudhakarthota$ bin/spark-shell
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.4.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
SQL context available as sqlContext.

scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@5ac35b17

scala> sqlContext.sql("CREATE TABLE IF NOT EXISTS test1 (name STRING, rank INT) ROW FORMAT
DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'")
res0: org.apache.spark.sql.DataFrame = [result: string]

scala> sc.stop()

scala> sqlContext.sql("CREATE TABLE IF NOT EXISTS test2 (name STRING, rank INT) ROW FORMAT
DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'")
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
	at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)
	at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:696)
	at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:695)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
	at org.apache.spark.SparkContext.withScope(SparkContext.scala:681)
	at org.apache.spark.SparkContext.parallelize(SparkContext.scala:695)
	at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:88)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:88)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:87)
	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:939)
	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:939)
	at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:144)
	at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:128)
	at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:744)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:24)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
	at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
	at $iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
	at $iwC$$iwC$$iwC.<init>(<console>:37)
	at $iwC$$iwC.<init>(<console>:39)
	at $iwC.<init>(<console>:41)
	at <init>(<console>:43)
	at .<init>(<console>:47)
	at .<clinit>(<console>)
	at .<init>(<console>:7)
	at .<clinit>(<console>)
	at $print(<console>)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
	at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
	at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
	at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
	at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
	at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
	at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
	at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
	at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
	at org.apache.spark.repl.Main$.main(Main.scala:31)
	at org.apache.spark.repl.Main.main(Main.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

scala> val fl1 = sc.textFile("/Users/sudhakarthota/spark/README.md")
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
	at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)
	at org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1912)
	at org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1925)
	at org.apache.spark.SparkContext.textFile$default$2(SparkContext.scala:797)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:21)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:26)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:28)
	at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
	at $iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
	at $iwC$$iwC$$iwC.<init>(<console>:34)
	at $iwC$$iwC.<init>(<console>:36)
	at $iwC.<init>(<console>:38)
	at <init>(<console>:40)
	at .<init>(<console>:44)
	at .<clinit>(<console>)
	at .<init>(<console>:7)
	at .<clinit>(<console>)
	at $print(<console>)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
	at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
	at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
	at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
	at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
	at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
	at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
	at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
	at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
	at org.apache.spark.repl.Main$.main(Main.scala:31)
	at org.apache.spark.repl.Main.main(Main.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


scala> 
----------------------

> Spark failed to delete temp directory created by HiveContext
> ------------------------------------------------------------
>
>                 Key: SPARK-8333
>                 URL: https://issues.apache.org/jira/browse/SPARK-8333
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.0
>         Environment: Windows7 64bit
>            Reporter: sheng
>            Priority: Minor
>              Labels: Hive, metastore, sparksql
>
> Spark 1.4.0 failed to stop SparkContext.
> {code:title=LocalHiveTest.scala|borderStyle=solid}
>  val sc = new SparkContext("local", "local-hive-test", new SparkConf())
>  val hc = Utils.createHiveContext(sc)
>  ... // execute some HiveQL statements
>  sc.stop()
> {code}
> sc.stop() failed to complete cleanly; it threw the following exception:
> {quote}
> 15/06/13 03:19:06 INFO Utils: Shutdown hook called
> 15/06/13 03:19:06 INFO Utils: Deleting directory C:\Users\moshangcheng\AppData\Local\Temp\spark-d6d3c30e-512e-4693-a436-485e2af4baea
> 15/06/13 03:19:06 ERROR Utils: Exception while deleting Spark temp dir: C:\Users\moshangcheng\AppData\Local\Temp\spark-d6d3c30e-512e-4693-a436-485e2af4baea
> java.io.IOException: Failed to delete: C:\Users\moshangcheng\AppData\Local\Temp\spark-d6d3c30e-512e-4693-a436-485e2af4baea
> 	at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:963)
> 	at org.apache.spark.util.Utils$$anonfun$1$$anonfun$apply$mcV$sp$5.apply(Utils.scala:204)
> 	at org.apache.spark.util.Utils$$anonfun$1$$anonfun$apply$mcV$sp$5.apply(Utils.scala:201)
> 	at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
> 	at org.apache.spark.util.Utils$$anonfun$1.apply$mcV$sp(Utils.scala:201)
> 	at org.apache.spark.util.SparkShutdownHook.run(Utils.scala:2292)
> 	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Utils.scala:2262)
> 	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(Utils.scala:2262)
> 	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(Utils.scala:2262)
> 	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1772)
> 	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(Utils.scala:2262)
> 	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(Utils.scala:2262)
> 	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(Utils.scala:2262)
> 	at scala.util.Try$.apply(Try.scala:161)
> 	at org.apache.spark.util.SparkShutdownHookManager.runAll(Utils.scala:2262)
> 	at org.apache.spark.util.SparkShutdownHookManager$$anon$6.run(Utils.scala:2244)
> 	at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
> {quote}
> It seems this bug was introduced by SPARK-6907, which creates a local Hive metastore in a temp
> directory. The problem is that the local Hive metastore is not shut down correctly: at the end
> of the application, when SparkContext.stop() is called, it tries to delete the temp directory,
> which is still in use by the local Hive metastore, and throws an exception.
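
For anyone hitting this on Windows: the underlying OS behavior is that a file cannot be deleted
while a process still holds an open handle to it, so the recursive delete in the shutdown hook
fails as long as the embedded metastore (Derby by default) keeps its files open. A minimal,
Spark-independent sketch of that failure mode (class and file names are illustrative):

{code:title=OpenHandleDeleteDemo.scala|borderStyle=solid}
import java.io.{File, FileOutputStream}

object OpenHandleDeleteDemo {
  def main(args: Array[String]): Unit = {
    // Create a temp directory containing one file.
    val dir = new File(System.getProperty("java.io.tmpdir"), "delete-demo")
    dir.mkdirs()
    val file = new File(dir, "held-open.log")
    val out = new FileOutputStream(file) // handle stays open

    // On Windows this delete fails (returns false) because the stream
    // above still holds a handle; on POSIX systems the unlink succeeds
    // even while the file is open.
    println(s"delete while handle open: ${file.delete()}")

    out.close()
    println(s"delete after close:       ${file.delete()}")
    // The directory can only be removed once it is empty.
    println(s"delete directory:         ${dir.delete()}")
  }
}
{code}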



