spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From 高佳翔 <gaoj...@gmail.com>
Subject Re: Need help with String Concat Operation
Date Wed, 18 Oct 2017 08:20:21 GMT
Hi Debu,

First, Instead of using ‘+’, you can use ‘concat’ to concatenate string
columns. And you should enclose “0” with "lit()" to make it a column.
Second, 1440 become null because you didn’t tell spark what to do if the
when clause is failed. So it simply set the value to null. To fix this, you
should add “.otherwise()” right behind “when()”.

The code looks like this:

ctoff_df.withColumn("CTOFF_NEW",
  when(
    length(col("CTOFF")) ==3,
    concat(lit("0"), col("CTOFF"))
  ).otherwise(
    col("CTOFF")
  ))

Best,

JiaXiang
​

On Wed, Oct 18, 2017 at 2:17 PM, Debabrata Ghosh <mailfordebu@gmail.com>
wrote:

> Hi,
>              I am having a dataframe column (name of the column is CTOFF)
> and I intend to prefix with '0' in case the length of the column is 3.
> Unfortunately, I am unable to acheive my goal and wonder whether you can
> help me here.
>
> Command which I am executing:
>
> ctoff_dedup_prep_temp = ctoff_df.withColumn('CTOFF_NEW',when(length(col('CTOFF'))
> == 3,'0'+col('CTOFF')))
> ctoff_dedup_prep_temp.show()
>
> +------------+----------+----------+--------------------+---
> ------+-----+---------+
> |EVNT_SRVC_AR|EVNT_FCLTY|EVNT_TP_CD|        NTWRK_PRD_CD|
> DY_OF_WK|CTOFF|CTOFF_NEW|
> +------------+----------+----------+--------------------+---
> ------+-----+---------+
> |         HKG|       HKC|        AR|2,3,7,8,C,D,J,P,Q...|1,2,3,4,5|
> 1440|     null|
> |         HKG|       HKC|        AR|             C,Q,T,Y|1,2,3,4,5|
> 730|    730.0|
> |         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|
> 600|    600.0|
> |         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|
> 900|    900.0|
> +------------+----------+----------+--------------------+---
> ------+-----+---------+
>
> The result which I want is:
> +------------+----------+----------+--------------------+---
> ------+-----+---------+
> |EVNT_SRVC_AR|EVNT_FCLTY|EVNT_TP_CD|        NTWRK_PRD_CD|
> DY_OF_WK|CTOFF|CTOFF_NEW|
> +------------+----------+----------+--------------------+---
> ------+-----+---------+
> |         HKG|       HKC|        AR|2,3,7,8,C,D,J,P,Q...|1,2,3,4,5|
> 1440|    1440|
> |         HKG|       HKC|        AR|             C,Q,T,Y|1,2,3,4,5|
> 730|    0730|
> |         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|
> 600|    0600|
> |         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|
> 900|    0900|
> +------------+----------+----------+--------------------+---
> ------+-----+---------+
>
> So I want the '0' to be prefixed but it's getting suffixed as '.0'. Any
> clue around why is this happening
>
> Thanks,
>
> Debu
>



-- 
Gao JiaXiang
Data Analyst, GCBI <http://www.gcbi.com.cn>

Mime
View raw message