spark-user mailing list archives

From Dávid Szakállas <david.szakal...@gmail.com>
Subject Support nested keys in DataFrameWriter.bucketBy
Date Mon, 15 Oct 2018 13:58:50 GMT
Currently (in Spark 2.3.1) we cannot bucket DataFrames by nested columns, e.g.

df.write.bucketBy(10, "key.a").saveAsTable("junk")

will result in the following exception:

org.apache.spark.sql.AnalysisException: bucket column key.a is not defined in table junk,
defined table columns are: key, value;
	at org.apache.spark.sql.catalyst.catalog.CatalogUtils$$anonfun$org$apache$spark$sql$catalyst$catalog$CatalogUtils$$normalizeColumnName$2.apply(ExternalCatalogUtils.scala:246)
	at org.apache.spark.sql.catalyst.catalog.CatalogUtils$$anonfun$org$apache$spark$sql$catalyst$catalog$CatalogUtils$$normalizeColumnName$2.apply(ExternalCatalogUtils.scala:246)
	at scala.Option.getOrElse(Option.scala:121)
	…
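
In the meantime, a workaround is to project the nested field to a top-level
column before bucketing. A minimal sketch (the names key_a and junk_flat are
just illustrative, not part of any existing schema):

import org.apache.spark.sql.functions.col

// Promote the nested field to a top-level column, then bucket on that
// column instead of the nested path.
df.withColumn("key_a", col("key.a"))
  .write
  .bucketBy(10, "key_a")
  .saveAsTable("junk_flat")

This sidesteps the check in normalizeColumnName, which (judging by the
exception above) only resolves top-level table columns, but it flattens the
schema of the saved table, so native support would still be welcome.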

Are there plans to change this anytime soon?

Thanks, David

