spark-user mailing list archives

From Femi Anthony <femib...@gmail.com>
Subject Pass row to UDF and select column based on pattern match
Date Tue, 09 Jul 2019 18:25:21 GMT
How can I achieve the following by passing a row to a UDF?

    val df1 = df.withColumn("col_Z", 
                  when($"col_x" === "a", $"col_A")
                  .when($"col_x" === "b", $"col_B")
                  .when($"col_x" === "c", $"col_C")
                  .when($"col_x" === "d", $"col_D")
                  .when($"col_x" === "e", $"col_E")
                  .when($"col_x" === "f", $"col_F")
                  .when($"col_x" === "g", $"col_G")
                  )

As I understand it, only columns can be passed as arguments to a UDF in Scala Spark.
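
For instance, something along these lines compiles (a rough sketch; pickCol is just an illustrative name, and it assumes spark.implicits._ is imported and the value columns are FloatType), but it means spelling out every column by hand, which is what I'd like to avoid:

    import org.apache.spark.sql.functions.udf

    // A plain UDF over individual columns: works, but each column has to
    // be passed explicitly, so it doesn't scale past a few cases.
    val pickCol = udf { (x: String, a: Float, b: Float) =>
      x match {
        case "a" => a
        case "b" => b
        case _   => 0.0f
      }
    }

    val df1 = df.withColumn("col_Z", pickCol($"col_x", $"col_A", $"col_B"))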

I have taken a look at this question:

https://stackoverflow.com/questions/31816975/how-to-pass-whole-row-to-udf-spark-dataframe-filter

and tried to implement this UDF:

    def myUDF(r: Row) = udf {
      // intended: pick the value column whose suffix matches col_x,
      // defaulting to 0.0 when nothing matches
      val z: Float = r.getAs("col_x") match {
        case "a"   => r.getAs("col_A")
        case "b"   => r.getAs("col_B")
        case other => lit(0.0)
      }
      z
    }

but I'm getting a type mismatch error:


     error: type mismatch;
     found   : String("a")
     required: Nothing
     case "a" => r.getAs("col_A")
          ^

What am I doing wrong?
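
For reference, the pattern from the linked question that I'm trying to adapt looks roughly like this (a sketch using my example's names; fromRow is an illustrative name, and it assumes the value columns are FloatType):

    import org.apache.spark.sql.Row
    import org.apache.spark.sql.functions.{col, struct, udf}

    // Pack all of df's columns into a struct so the UDF sees them as a Row.
    val fromRow = udf { (r: Row) =>
      r.getAs[String]("col_x") match {
        case "a" => r.getAs[Float]("col_A")
        case "b" => r.getAs[Float]("col_B")
        case _   => 0.0f
      }
    }

    val df1 = df.withColumn("col_Z", fromRow(struct(df.columns.map(col): _*)))

If that's the right shape, then presumably my mistake is somewhere in how I'm defining myUDF above.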


Sent from my iPhone