From Horváth Péter Gergely
Subject Spark2 DataFrameWriter.saveAsTable defaults to external table if path is provided
Date Wed, 13 Feb 2019 11:37:42 GMT
Dear All,

I am facing a strange issue with Spark 2.3: I would like to create a
MANAGED table out of the content of a DataFrame with a custom storage path.

Apparently, when one tries to create a Hive table via
DataFrameWriter.saveAsTable, supplying a "path" option causes Spark to
automatically create an external table.

This demonstrates the behaviour:

scala> val numbersDF = sc.parallelize((1 to 100).toList).toDF("numbers")
numbersDF: org.apache.spark.sql.DataFrame = [numbers: int]

scala> numbersDF.write.format("orc").saveAsTable("numbers_table1")

scala> spark.sql("describe formatted numbers_table1").filter(_.get(0).toString == "Type").show
|    Type|  MANAGED|       |

scala> numbersDF.write.format("orc").option("path",

scala> spark.sql("describe formatted numbers_table2").filter(_.get(0).toString == "Type").show
|    Type| EXTERNAL|       |
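
As a side note, the table type can probably be read more directly through the
Catalog API than by filtering the "describe formatted" output. The following
is only a minimal sketch: it assumes a Spark 2.x spark-shell session in which
`spark` is predefined and both tables from the transcript above already exist.

```scala
// Assumption: run inside spark-shell, where `spark` (a SparkSession) is predefined,
// and numbers_table1 / numbers_table2 were created as in the transcript above.
Seq("numbers_table1", "numbers_table2").foreach { name =>
  // Catalog.getTable returns the table's metadata; tableType is a plain String
  // (for example "MANAGED" or "EXTERNAL").
  val table = spark.catalog.getTable(name)
  println(s"${table.name} -> ${table.tableType}")
}
```

This only inspects the catalog metadata; it does not change how the tables
are created.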

I am wondering if there is any way to force creation of a managed table
with a custom path (which as far as I know, should be possible via standard
Hive commands).

I often find that I cannot locate the appropriate documentation for the
options accepted by Spark APIs. Could someone please point me in the right
direction and tell me where these things are documented?

