spark-user mailing list archives

From Abhinesh Hada <abhinesh...@gmail.com>
Subject [Spark SQL]: Does Union operation followed by drop duplicate follows "keep first"
Date Fri, 13 Sep 2019 15:43:28 GMT
Hi,

I am trying to take the union of two DataFrames and then drop duplicates
based on the value of a specific column. However, I want to make sure that
when duplicates are dropped, the rows from the first DataFrame are the ones kept.

Example:
df1 = df1.union(df2).dropDuplicates(['id'])
