spark-user mailing list archives

From Divya Gehlot <divya.htco...@gmail.com>
Subject [Spark 1.6]-increment value column based on condition + Dataframe
Date Tue, 09 Aug 2016 12:34:00 GMT
Hi,
I have a column with values like these:
Value
30
12
56
23
12
16
12
89
12
5
6
4
8

I need to create another column that restarts at 0 on a row where
col("value") > 30 and increases by 1 on each following row where
col("value") < 30:
newColValue
0
1
0
1
2
3
4
0
1
2
3
4
5
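
To make the expected output above concrete, the rule can be sketched in plain Scala, with no Spark involved. This is only an illustration: the object and method names are hypothetical, and it assumes that a value of exactly 30 also restarts the counter, which is what the first row of the sample suggests.

```scala
object IncrementSketch {
  // Restart at 0 when v >= threshold (assumption: 30 itself counts as a restart),
  // otherwise add 1 to the previous counter value.
  def counter(values: Seq[Int], threshold: Int = 30): Seq[Int] =
    values
      .scanLeft(-1) { (prev, v) => if (v >= threshold) 0 else prev + 1 }
      .tail // drop the seed, which is not a real row

  def main(args: Array[String]): Unit = {
    val values = Seq(30, 12, 56, 23, 12, 16, 12, 89, 12, 5, 6, 4, 8)
    println(counter(values).mkString(", "))
  }
}
```

The seed of -1 means a group whose first value is below the threshold would also start at 0.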

How can I create such an incrementing column?
The grouping is based on some other columns that are not shown here.
When I try the window sum function, it sums, but instead of incrementing
row by row, the total sum is displayed in all the rows.
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum
val overWin = Window.partitionBy('col1,'col2,'col3).orderBy('Value)
val total = sum('Value).over(overWin)

With this logic I am getting the result below:
0
1
0
4
4
4
4
0
5
5
5
5
5

I have also written my own UDF, but custom UDFs are not supported in
window functions in Spark 1.6.
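
One approach that might avoid a UDF is to derive a block id from a running count of the "restart" rows and then number the rows inside each block. The sketch below is untested against the real data and rests on assumptions: `seq` is a hypothetical column recording the original row order (ordering by 'Value itself would lose that order), `withIncrement` and `blockId` are illustrative names, and a value of exactly 30 is treated as a restart. As far as I know, window functions in Spark 1.6 also need a HiveContext rather than a plain SQLContext.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number, sum, when}

// Sketch only: col1..col3 stand in for the grouping columns,
// and `seq` is a hypothetical column that preserves the row order.
def withIncrement(df: DataFrame): DataFrame = {
  val ordered = Window.partitionBy("col1", "col2", "col3").orderBy("seq")

  // 1 on rows that restart the counter, 0 elsewhere (assumption: >= 30 restarts).
  val isReset = when(col("Value") >= 30, 1).otherwise(0)

  // Running count of restarts from the first row of the partition to the
  // current row; rows with the same blockId belong to the same run.
  val blockId = sum(isReset).over(ordered.rowsBetween(Long.MinValue, 0))

  // Number the rows inside each run, starting from 0.
  val perBlock =
    Window.partitionBy("col1", "col2", "col3", "blockId").orderBy("seq")

  df.withColumn("blockId", blockId)
    .withColumn("newColValue", row_number().over(perBlock) - 1)
    .drop("blockId")
}
```

The explicit `rowsBetween` frame is there so that the running count advances one row at a time instead of depending on the default frame.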

I would really appreciate any help.


Thanks,
Divya




Am I missing something?
