spark-user mailing list archives

From Vamshi Talla
Subject Re: How to avoid duplicate column names after join with multiple conditions
Date Mon, 09 Jul 2018 02:39:43 GMT

Spark does not create a duplicate column when you pass the join keys as a list of column
names, as in the example below, but that requires the column name to be the same in both data frames.

Example: df1.join(df2, ['a'])

Vamshi Talla

On Jul 6, 2018, at 4:47 PM, Gokula Krishnan D wrote:


The withColumnRenamed() API might help, but it does not differentiate between the two columns and
renames all occurrences of the given column. Alternatively, use the select() API and rename the columns as you want.

Thanks & Regards,
Gokula Krishnan (Gokul)

On Mon, Jul 2, 2018 at 5:52 PM, Nirav Patel wrote:
Expr is `df1(a) === df2(a) and df1(b) === df2(c)`

How can I avoid the duplicate column 'a' in the result? I don't see any API that combines both. Rename
