spark-user mailing list archives

From Nirav Patel <npa...@xactlycorp.com>
Subject Insert into dynamic partitioned hive/parquet table throws error - Partition spec contains non-partition columns
Date Fri, 03 Aug 2018 00:01:42 GMT
I am trying to insert overwrite multiple partitions into an existing
partitioned Hive/Parquet table. The table was created using sparkSession.

I have a table 'mytable' partitioned by columns P1 and P2.

I have the following set on the sparkSession object:

    .config("hive.exec.dynamic.partition", true)
    .config("hive.exec.dynamic.partition.mode", "nonstrict")

val df = spark.read.csv(pathToNewData)

df.createOrReplaceTempView("updateTable")

Here 'df' may contain data for multiple partitions, i.e. multiple values
of P1 and P2 in the data.
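
For illustration only, a hypothetical frame of this shape (the column
names and values below are made up, not from the real data):

    import spark.implicits._

    // Hypothetical sample, not the real data: rows spanning two
    // distinct (P1, P2) partition values.
    val sampleDf = Seq(
      ("a", 1085, 164590861),
      ("b", 1086, 164590862)
    ).toDF("c1", "P1", "P2")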


spark.sql("insert overwrite table mytable PARTITION(P1, P2) select c1,
c2,..cn, P1, P2 from updateTable") // I made sure that partition columns P1
and P2 are at the end of projection list.
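
For comparison, the same write can also be expressed through the DataFrame
writer; this is an unverified sketch (behavior varies by Spark version,
and insertInto matches columns by position, so P1 and P2 must stay last):

    // Unverified alternative sketch: insertInto resolves columns by
    // position, so the partition columns P1 and P2 must be the last
    // columns of the projection, as in the SQL above.
    spark.table("updateTable")
      .select("c1", "c2", /* ..., cn, */ "P1", "P2")
      .write
      .mode("overwrite")
      .insertInto("mytable")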

I am getting the following error:

org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.Table.ValidationFailureSemanticException: Partition spec {p1=, p2=, P1=1085, P2=164590861} contains non-partition columns;

The dataframe 'df' does have records for P1=1085, P2=164590861.

