spark-user mailing list archives

From SiMaYunRui <>
Subject Percentile example
Date Sun, 15 Feb 2015 15:37:09 GMT
I am new to Spark and am trying to figure out how to compute a percentile over a large data set.
I searched for this topic but did not find a useful code example or explanation.
It seems I can use the sortByKey transformation to put my data set in order, but I am not sure
how to get the value at, for example, the 66th percentile.
Should I use take() to pick out the value at the 66th percentile? I don't believe any single machine
can load my data set into memory, so I believe there must be a more efficient approach.
Can anyone shed some light on this problem?
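
For illustration, the sort-then-pick idea described above might look roughly like the sketch below. It is only a sketch under assumptions: `nearest_rank_index` and `percentile` are hypothetical helper names, `rdd` is assumed to be a PySpark RDD of comparable values, `p` is assumed to be an integer between 0 and 100, and the nearest-rank definition of a percentile is one choice among several.

```python
def nearest_rank_index(n, p):
    """Zero-based position of the p-th percentile (nearest-rank method)
    among n sorted values. Hypothetical helper; p is an integer in 0..100."""
    if n <= 0:
        raise ValueError("the data set is empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    rank = -(-p * n // 100)  # ceil(p * n / 100) using integer arithmetic
    return max(rank, 1) - 1  # convert the 1-based rank to a 0-based index


def percentile(rdd, p):
    """Hypothetical helper: rdd is assumed to be a PySpark RDD of
    comparable values (not key/value pairs)."""
    n = rdd.count()                       # one pass to learn the size
    target = nearest_rank_index(n, p)
    return (rdd.sortBy(lambda x: x)       # distributed sort across the cluster
               .zipWithIndex()            # pair each value with its sorted position
               .filter(lambda pair: pair[1] == target)
               .map(lambda pair: pair[0])
               .first())                  # only one value is sent to the driver
```

With this shape, the sort stays distributed and only the single selected value is returned to the driver, so the full data set should not need to fit in one machine's memory. Something like `percentile(sc.parallelize(values), 66)` would be the intended call, assuming `sc` is an existing SparkContext.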