spark-user mailing list archives

From Richard Xin <richardxin...@yahoo.com.INVALID>
Subject is partitionBy of DataFrameWriter supported in 1.6.x?
Date Thu, 19 Jan 2017 05:44:06 GMT
I found a contradiction between the 1.6.0 and 2.1.x documentation for partitionBy.
http://spark.apache.org/docs/1.6.0/api/scala/index.html#org.apache.spark.sql.DataFrameWriter
says: "This is only applicable for Parquet at the moment."
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameWriter
says: "This was initially applicable for Parquet but in 1.5+ covers JSON, text, ORC and avro
as well."
I also got a warning when trying to save in Scala:
> df.write.mode("overwrite").format("orc").partitionBy("date").saveAsTable("test.my_test")
17/01/19 13:34:43 WARN hive.HiveContext$$anon$2: Persisting partitioned data source relation
`test`.`my_test` into Hive metastore in Spark SQL specific format, which is NOT compatible
with Hive. Input path(s): 
hdfs://nameservice1/user/hive/warehouse/test.db/my_test
Looking at the HDFS directories, the partition folders are there, but the table is not
queryable from either Presto or Hive.
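
For reference, the workaround I'm considering is to write the partitioned ORC files to a
plain path and register an external Hive table over it myself. This is only a rough sketch;
the output path, table name and column list below are placeholders, not my real schema:

import org.apache.spark.sql.hive.HiveContext

// assumes an existing SparkContext `sc` (e.g. in spark-shell)
val hiveContext = new HiveContext(sc)

// placeholder source DataFrame; in practice this is whatever we are saving
val df = hiveContext.table("test.some_source")

// write partitioned ORC files to an explicit path instead of saveAsTable,
// so the on-disk layout stays plain ORC that Hive and Presto can read
df.write
  .mode("overwrite")
  .partitionBy("date")
  .orc("hdfs://nameservice1/user/hive/external/test/my_test_orc")

// declare a matching external table in the metastore (columns are placeholders)
// and let Hive discover the partition directories
hiveContext.sql(
  """CREATE EXTERNAL TABLE IF NOT EXISTS test.my_test_orc (col1 STRING, col2 INT)
    |PARTITIONED BY (`date` STRING)
    |STORED AS ORC
    |LOCATION 'hdfs://nameservice1/user/hive/external/test/my_test_orc'""".stripMargin)
hiveContext.sql("MSCK REPAIR TABLE test.my_test_orc")

But I'd rather use saveAsTable directly if it is actually supported for ORC in 1.6.x.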

Any comments?
Thanks.
