spark-user mailing list archives

From anbutech <anbutec...@outlook.com>
Subject Spark 2.2 With Column usage
Date Sat, 08 Jun 2019 04:05:27 GMT
Hi,

Could you please advise how to fix the issue below with withColumn in a
Spark 2.2 (Scala 2.11) join?

def processing(spark: SparkSession,
               dataset1: Dataset[Reference],
               dataset2: Dataset[DataCore],
               dataset3: Dataset[ThirdPartyData],
               dataset4: Dataset[OtherData],
               date: String): Dataset[DataMerge] = {

  val referenceFiltered = dataset2
    .filter(_.dataDate == date)
    .filter(_.someColumn)
    .select("id")
    .toString

  dataset1.as("t1")
    .join(dataset3.as("t2"),
      col("t1.col1") === col("t2.col1"), JOINTYPE.Inner)
    .join(dataset4.as("t3"),
      col("t3.col1") === col("t1.col1"), JOINTYPE.Inner)
    .withColumn("new_column", lit(referenceFiltered))
    .selectExpr(
      "id", // -------------------> want to get this value
      "column1",
      "column2",
      "column3",
      "column4")
}

How do I get the String value (say "124567") of "referenceFiltered"
inside the withColumn?

At the moment the withColumn output is the schema string "id: bigint"
rather than the value. I want to get the same value for all the records.

Note:

I have been asked not to use a cross join in the code. Is there any
other way to fix this issue?
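(For context, a sketch of one common workaround: Dataset.toString only
returns the schema string, which is why "id: bigint" shows up. Assuming
"id" is a Spark LongType and the filters match at least one row, the
scalar can be collected to the driver first and then embedded with lit().
Names such as dataset2, dataDate, and someColumn follow the code above;
"joined" stands in for the join result.)

```scala
// Collect the single "id" value to the driver, then use it as a literal.
// first() triggers a Spark job; it throws if no row matches the filters.
val referenceId: String = dataset2
  .filter(_.dataDate == date)
  .filter(_.someColumn)
  .select("id")
  .first()              // Row with one column
  .getAs[Long]("id")    // "id" was reported as BigInt, i.e. Spark LongType
  .toString

// Every record gets the same literal value -- no cross join required.
val result = joined.withColumn("new_column", lit(referenceId))
```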



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org

