spark-user mailing list archives

From Prem Sure <sparksure...@gmail.com>
Subject Re: How to avoid duplicate column names after join with multiple conditions
Date Fri, 13 Jul 2018 01:40:18 GMT
Yes Nirav, we can probably ask the developers for a config parameter that
takes care of this automatically (internally). Until then, users need to take
extra care when specifying column names in joins.

Thanks,
Prem
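
Until such an option exists, the manual care mentioned above can be wrapped in a small helper. The following is a minimal sketch in PySpark (the language of Vamshi's example further down the thread), not an existing Spark API: the names `collision_renames` and `with_unique_columns` and the `_right` suffix are hypothetical. It relies only on `DataFrame.columns` and `DataFrame.withColumnRenamed`, and it compares names case-sensitively, whereas Spark resolves column names case-insensitively by default.

```python
def collision_renames(left_columns, right_columns, suffix="_right"):
    """Map each right-side column name that also exists on the left to a new, unused name."""
    left = set(left_columns)
    taken = left | set(right_columns)
    renames = {}
    for name in right_columns:
        if name in left:
            new_name = name + suffix
            # Keep appending the suffix until the candidate name is free on both sides.
            while new_name in taken:
                new_name += suffix
            taken.add(new_name)
            renames[name] = new_name
    return renames


def with_unique_columns(left_df, right_df, suffix="_right"):
    """Return right_df with every column that collides with left_df renamed.

    Hypothetical helper: it only touches DataFrame.columns and
    DataFrame.withColumnRenamed, so it does not import pyspark itself.
    """
    renames = collision_renames(left_df.columns, right_df.columns, suffix)
    for old, new in renames.items():
        right_df = right_df.withColumnRenamed(old, new)
    return right_df
```

With the frames from the original question, `df2u = with_unique_columns(df1, df2)` would leave `df2u` with `a_right` in place of `a`. The join `df1.join(df2u, (df1["a"] == df2u["a_right"]) & (df1["b"] == df2u["c"]))` then produces no duplicate name, and `a_right` can be dropped afterwards if it is not needed.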

On Thu, Jul 12, 2018 at 10:53 PM Nirav Patel <npatel@xactlycorp.com> wrote:

> Hi Prem, dropping or renaming the column works for me as a workaround.
> I just thought it would be nice to have a generic API that handles this for
> me, or some intelligence so that, since both columns are the same, a
> subsequent select clause doesn't complain that it can't tell whether I mean
> a#12 or a#81. They are both the same, so just pick one.
>
> On Thu, Jul 12, 2018 at 9:38 AM, Prem Sure <sparksure542@gmail.com> wrote:
>
>> Hi Nirav, did you try
>> .drop(df1("a")) after the join?
>>
>> Thanks,
>> Prem
>>
>> On Thu, Jul 12, 2018 at 9:50 PM Nirav Patel <npatel@xactlycorp.com>
>> wrote:
>>
>>> Hi Vamshi,
>>>
>>> That API is very restrictive and not generic enough. It requires every
>>> join condition to use the same column name on both sides, and it only
>>> supports equi-joins. It doesn't serve my use case, where some join
>>> predicates don't have the same column names.
>>>
>>> Thanks
>>>
>>> On Sun, Jul 8, 2018 at 7:39 PM, Vamshi Talla <vamshi_t@hotmail.com>
>>> wrote:
>>>
>>>> Nirav,
>>>>
>>>> Spark does not create a duplicate column when you pass the join columns
>>>> as a list of column name(s), as in the example below, but that requires
>>>> the column name to be the same in both data frames.
>>>>
>>>> Example: *df1.join(df2, ['a'])*
>>>>
>>>> Thanks.
>>>> Vamshi Talla
>>>>
>>>> On Jul 6, 2018, at 4:47 PM, Gokula Krishnan D <email2dgk@gmail.com>
>>>> wrote:
>>>>
>>>> Nirav,
>>>>
>>>> The withColumnRenamed() API might help, but it does not differentiate
>>>> between columns and renames all occurrences of the given column name.
>>>> Alternatively, use the select() API and rename the columns as you want.
>>>>
>>>>
>>>>
>>>> Thanks & Regards,
>>>> Gokula Krishnan* (Gokul)*
>>>>
>>>> On Mon, Jul 2, 2018 at 5:52 PM, Nirav Patel <npatel@xactlycorp.com>
>>>> wrote:
>>>>
>>>>> Expr is `df1("a") === df2("a") and df1("b") === df2("c")`
>>>>>
>>>>> How do I avoid the duplicate column 'a' in the result? I don't see any
>>>>> API that combines both. Do I have to rename manually?
>>>>>
>>>
>>
>>
>
>
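
For reference, the drop workaround suggested in this thread can be sketched in PySpark as follows. This is an illustration under assumptions, not a built-in API: the function name `join_dropping_right_keys` and its parameters are hypothetical, and `left` and `right` are assumed to be two distinct DataFrames rather than two references to the same one. It uses only `DataFrame.join`, column indexing with `df[name]`, and `DataFrame.drop` with a Column argument.

```python
from functools import reduce


def join_dropping_right_keys(left, right, shared_keys, other_pairs=(), how="inner"):
    """Join on mixed predicates, then drop the right-hand copy of each shared key.

    shared_keys: column names present on both sides (left[k] == right[k]).
    other_pairs: (left_name, right_name) pairs whose names differ.
    """
    terms = [left[key] == right[key] for key in shared_keys]
    terms += [left[lk] == right[rk] for lk, rk in other_pairs]
    if not terms:
        raise ValueError("at least one join predicate is required")

    # Combine all equality predicates with a logical AND.
    condition = reduce(lambda acc, term: acc & term, terms)
    joined = left.join(right, condition, how)

    # Drop the right-hand column by reference, so the left-hand column
    # with the same name is kept.
    for key in shared_keys:
        joined = joined.drop(right[key])
    return joined
```

For the expression in the original question, the call would be `join_dropping_right_keys(df1, df2, ["a"], [("b", "c")])`, which keeps the `a` column of `df1` and removes the copy coming from `df2`.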
