spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Spark DF does not rename the column
Date Mon, 04 Jan 2021 18:16:18 GMT
Thanks Jayesh for spotting that typo.


Cheers


Mich


LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw





*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 4 Jan 2021 at 18:09, Lalwani, Jayesh <jlalwani@amazon.com> wrote:

> You don’t have a column named “created”. The column name is “ceated”,
> without the “r”.
>
>
>
> *From: *Mich Talebzadeh <mich.talebzadeh@gmail.com>
> *Date: *Monday, January 4, 2021 at 1:06 PM
> *To: *"user @spark" <user@spark.apache.org>
> *Subject: *[EXTERNAL] Spark DF does not rename the column
>
>
>
> Hi,
>
>
>
> Spark version 2.4.3
>
>
>
> I don't know the cause of this.
>
>
>
> Renaming DataFrame columns like this used to work fine. I made a couple of
> changes to Spark/Scala code that is unrelated to this table, and now it
> refuses to rename the columns for this table.
>
>
>
> val summaryACC = HiveContext.table("summaryACC")
>
>
>
> summaryACC.printSchema()
>
>
>
> root
>  |-- ceated: string (nullable = true)
>  |-- hashtag: string (nullable = true)
>  |-- paid: float (nullable = true)
>  |-- received: float (nullable = true)
>
>
>
> summaryACC.
>     orderBy(desc("paid"),desc("received")).
>     withColumnRenamed("created","Date Calculated").
>     withColumnRenamed("hashtag","Who").
>     withColumn(("received"),format_number(col("received"),2)).
>     withColumn(("paid"),format_number(col("paid"),2)).
>     withColumnRenamed("paid","paid out/GBP").
>     withColumnRenamed("received","paid in/GBP").
>     withColumn("paid in/GBP",when(col("paid in/GBP") === "0.00","--").otherwise(col("paid in/GBP"))).
>     withColumn("paid out/GBP",when(col("paid out/GBP") === "0.00","--").otherwise(col("paid out/GBP"))).
>     select("Date Calculated","Who","paid in/GBP","paid out/GBP").show(1000,false)
>
>
>
> and this is the error
>
>
>
> org.apache.spark.sql.AnalysisException: cannot resolve '`Date Calculated`'
> given input columns: [alayer.summaryacc.ceated, Who, paid out/GBP, paid
> in/GBP];;
>
>
>
> This used to work before, producing output such as:
>
>
>
> +----------------------------+------------------+-----------+------------+
> |Date Calculated             |Who               |paid in/GBP|paid out/GBP|
> +----------------------------+------------------+-----------+------------+
> |Mon Jan 04 14:22:17 GMT 2021|paypal            |579.98     |1,526.86    |
>
>
>
> Appreciate any ideas.
>
>
>
> Thanks, Mich
>
>
>
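
The diagnosis in this thread turns on one detail: `withColumnRenamed` is documented as a no-op when the schema does not contain the existing column name, so the misspelled rename raised nothing and the failure only surfaced at the later `select`. A minimal sketch of a pre-check that would have flagged it earlier is below. `RenameCheck` and `missingSources` are hypothetical names, not part of Spark, and the helper takes plain column names (as returned by `DataFrame.columns`) so that it runs without a Spark dependency. It compares names case-insensitively on the assumption that Spark's default, case-insensitive resolution is in effect.

```scala
// Hypothetical helper, not part of Spark: list rename sources that are
// absent from a schema, so a misspelled column is caught before the chain runs.
object RenameCheck {
  // `available` stands in for DataFrame.columns; `renames` holds (old, new) pairs.
  // Matching ignores case, assuming Spark's default case-insensitive resolution.
  def missingSources(available: Seq[String], renames: Seq[(String, String)]): Seq[String] =
    renames.map(_._1).filterNot(old => available.exists(_.equalsIgnoreCase(old)))

  def main(args: Array[String]): Unit = {
    // Column names copied from the printSchema() output quoted in the thread.
    val available = Seq("ceated", "hashtag", "paid", "received")
    val renames = Seq("created" -> "Date Calculated", "hashtag" -> "Who")

    val missing = missingSources(available, renames)
    if (missing.nonEmpty)
      println(s"Rename sources not found in schema: ${missing.mkString(", ")}")
  }
}
```

Against a real DataFrame the call would be along the lines of `RenameCheck.missingSources(summaryACC.columns.toSeq, renames)`. With the schema quoted above it reports `created` as missing, which is the typo Jayesh pointed out.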
