spark-user mailing list archives

From talgr <>
Subject dense_rank skips ranks on cube
Date Mon, 20 Jun 2016 14:00:28 GMT
I have a DataFrame with 7 dimensions, and I built a cube on them:

val cube = df.cube('d1,'d2,'d3,'d4,'d5,'d6,'d7)
val cc = cube.agg(sum('p1).as("p1"),sum('p2).as("p2")).cache

and then defined a rank function on a window:

 val rankSpec =
 val grank = dense_rank().over(rankSpec)
 val cubed = cc.withColumn("rank",grank)

When I do:
cubed.filter('d1.isNull && 'd2.isNull && 'd3.isNull && 'd4.isNull &&
'd5.isNull && 'd6.isNull && 'd7.isNotNull).sort('rank).show

I see that the first ranks are 3, 5, 9, 10, 11, 12, 13, 15...

It seems that the ranks become denser at higher values.
Any ideas?

