spark-user mailing list archives

From Kevin Jung <itsjb.j...@samsung.com>
Subject Re: Drop table and Hive warehouse
Date Tue, 25 Aug 2015 04:53:30 GMT
Thanks, Michael.
I found the cause myself. In the end, it was not a bug in Spark.
I have two HDFS clusters. Hive uses hive.metastore.warehouse.dir + fs.defaultFS (HDFS1) for
saving internal tables, but it also references the default database URI (HDFS2) stored in the
"DBS" table of the metastore.
This is probably not a problem when the URI of the default database is the same as fs.defaultFS.
Perhaps only a few people set their default database URI to another HDFS, as I did.
I copied hive-site.xml into the Spark conf directory, so Hive and Spark had the same metastore configuration.
But the table produced by "saveAsTable" has its metadata in HDFS1 and its data in HDFS2.
"DESCRIBE FORMATTED <table_name>" shows the difference between the Location of the table (HDFS1)
and the Path in Storage Desc Params (HDFS2), even though the table type is MANAGED_TABLE.
That is why "DROP TABLE" deletes only the metadata in HDFS1 and does NOT delete the data files in HDFS2.
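
A minimal sketch of that Location/Path comparison, assuming the two values have already been copied by hand out of the "DESCRIBE FORMATTED" output. The host names hdfs1 and hdfs2, the port, the table path and the helper name same_filesystem are all hypothetical; only the Python standard library is used.

```python
from urllib.parse import urlparse


def same_filesystem(location_uri, storage_path_uri):
    """Return True if both URIs share the same scheme and authority (host:port)."""
    a = urlparse(location_uri)
    b = urlparse(storage_path_uri)
    return (a.scheme, a.netloc) == (b.scheme, b.netloc)


# Hypothetical values standing in for the two fields of DESCRIBE FORMATTED.
table_location = "hdfs://hdfs1:8020/user/hive/warehouse/my_table"
storage_path = "hdfs://hdfs2:8020/user/hive/warehouse/my_table"

if not same_filesystem(table_location, storage_path):
    print("Location and Path point at different HDFS clusters")
```

If the check reports a mismatch for a MANAGED_TABLE, the data files are likely to be left behind after "DROP TABLE", as described above.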
So I cannot recreate a table with the same location and the same name. If I update the DBS table in
the metastore database to set the default database URI to HDFS1, it works perfectly.
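
The DBS update can be sketched roughly as follows. This is only an illustration: an in-memory SQLite table stands in for the real metastore database (which normally runs on MySQL, PostgreSQL or Derby), the column names NAME and DB_LOCATION_URI are assumed to match the Hive metastore schema in use, and the URIs and the helper name point_default_db_at are hypothetical. Back up the metastore before trying anything like this on a real system.

```python
import sqlite3

# In-memory stand-in for the metastore database; not a real Hive metastore.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, NAME TEXT, DB_LOCATION_URI TEXT)"
)
# The default database initially points at the second cluster (hypothetical URI).
conn.execute(
    "INSERT INTO DBS (NAME, DB_LOCATION_URI) VALUES (?, ?)",
    ("default", "hdfs://hdfs2:8020/user/hive/warehouse"),
)


def point_default_db_at(connection, new_uri):
    """Set the default database URI and return the value now stored in DBS."""
    connection.execute(
        "UPDATE DBS SET DB_LOCATION_URI = ? WHERE NAME = 'default'", (new_uri,)
    )
    connection.commit()
    row = connection.execute(
        "SELECT DB_LOCATION_URI FROM DBS WHERE NAME = 'default'"
    ).fetchone()
    return row[0]


# Align the default database with fs.defaultFS (HDFS1 in this thread).
print(point_default_db_at(conn, "hdfs://hdfs1:8020/user/hive/warehouse"))
```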


Kevin

------- Original Message -------
Sender : Michael Armbrust<michael@databricks.com>
Date : 2015-08-25 00:43 (GMT+09:00)
Title : Re: Drop table and Hive warehouse

That's not the expected behavior. What version of Spark?


On Mon, Aug 24, 2015 at 1:32 AM, Kevin Jung <itsjb.jung@samsung.com> wrote:

When I store a DataFrame as a table with the "saveAsTable" command and then execute "DROP TABLE" in
Spark SQL, it doesn't actually delete the files in the Hive warehouse.
The table disappears from the table list, but the data files are still there.
Because of this, I can't saveAsTable again with the same name as the dropped table.
Is this normal? If it is, I will delete the files manually ;)

Kevin



