spark-user mailing list archives

From Bjørn Jørgensen <bjornjorgen...@gmail.com>
Subject Re: Problems with update function in koalas - pyspark pandas.
Date Sun, 12 Sep 2021 10:39:29 GMT


https://issues.apache.org/jira/browse/SPARK-36722

https://github.com/apache/spark/pull/33968

On 2021/09/11 10:06:50, Bjørn Jørgensen <bjornjorgensen@gmail.com> wrote:

> Hi, I am using "from pyspark import pandas as ps" in a master build from yesterday.
> I have some columns that I need to combine into one.
> In pandas I use update.
> 
>  
> 54   FD_OBJECT_SUPPLIES_SERVICES_OBJECT_SUPPLY_SERVICE_ADDITIONAL_INFORMATION     23 non-null      object
> 55   FD_OBJECT_SUPPLIES_SERVICES_OBJECT_SUPPLY_SERVICE_ADDITIONAL_INFORMATION.P   24348 non-null   object
>  
>  
>  pd1['FD_OBJECT_SUPPLIES_SERVICES_OBJECT_SUPPLY_SERVICE_ADDITIONAL_INFORMATION'].update(pd1['FD_OBJECT_SUPPLIES_SERVICES_OBJECT_SUPPLY_SERVICE_ADDITIONAL_INFORMATION.P'])
>  
>  ---------------------------------------------------------------------------
> AssertionError                            Traceback (most recent call last)
> /tmp/ipykernel_73/391781247.py in <module>
> ----> 1 pd1['FD_OBJECT_SUPPLIES_SERVICES_OBJECT_SUPPLY_SERVICE_ADDITIONAL_INFORMATION'].update(pd1['FD_OBJECT_SUPPLIES_SERVICES_OBJECT_SUPPLY_SERVICE_ADDITIONAL_INFORMATION.P'])
> 
> /opt/spark/python/pyspark/pandas/series.py in update(self, other)
>    4549             raise TypeError("'other' must be a Series")
>    4550 
> -> 4551         combined = combine_frames(self._psdf, other._psdf, how="leftouter")
>    4552 
>    4553         this_scol = combined["this"]._internal.spark_column_for(self._column_label)
> 
> /opt/spark/python/pyspark/pandas/utils.py in combine_frames(this, how, preserve_order_column, *args)
>     139     elif len(args) == 1 and isinstance(args[0], DataFrame):
>     140         assert isinstance(args[0], DataFrame)
> --> 141         assert not same_anchor(
>     142             this, args[0]
>     143         ), "We don't need to combine. `this` and `that` are same."
> 
> AssertionError: We don't need to combine. `this` and `that` are same.
> 
> 
> pd1.info()
> 
> 54   FD_OBJECT_SUPPLIES_SERVICES_OBJECT_SUPPLY_SERVICE_ADDITIONAL_INFORMATION     23 non-null      object
> 55   FD_OBJECT_SUPPLIES_SERVICES_OBJECT_SUPPLY_SERVICE_ADDITIONAL_INFORMATION.P   24348 non-null   object
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
> 
> 
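
For reference, a minimal sketch in plain pandas of the behaviour the quoted
message expects from update: values in the target Series are overwritten by
the non-null values of the other Series, aligned on the index. The frame
`pdf` and the shortened column names "INFO" and "INFO.P" are hypothetical
stand-ins for pd1 and its two long column names. The `where` expression at
the end gives the same values without calling update; whether it avoids the
assertion on pandas-on-Spark has not been checked here.

```python
import pandas as pd

# Hypothetical stand-in for pd1; "INFO" and "INFO.P" are shortened
# placeholders for the two long column names in the quoted message.
pdf = pd.DataFrame(
    {
        "INFO": ["a", None, None, "d"],
        "INFO.P": [None, "p1", "p2", "p3"],
    }
)

# Series.update works in place: every non-null value of the other Series
# overwrites the value at the same index label in the target Series.
target = pdf["INFO"].copy()
target.update(pdf["INFO.P"])
merged = target.tolist()

# Same values as a single expression: take "INFO.P" where it is non-null,
# otherwise fall back to "INFO".
via_where = pdf["INFO.P"].where(pdf["INFO.P"].notna(), pdf["INFO"])

print(merged)
print(via_where.tolist())
```

Note that update lets the second column win wherever it has a value, so an
existing value in the first column ("d" in this sketch) is replaced.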
