spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Laeeq Ahmed <laeeqsp...@yahoo.com>
Subject Taking value out from Dstream for each RDD
Date Fri, 09 May 2014 09:53:39 GMT
Hi all,

I want to calculate mean and SD for each RDD. I used the followoing code for mean and now
I have to use this mean for SD, but not sure how to use get these means for each RDD from
the DStream, so I can use it for SD. My sample files is as

1
2
3
4
5


The code is as

 val individualpoints = ssc.textFileStream(args(1))
val sumandcount = individualpoints.map(x => (x.toDouble, 1)).reduce{ (a, b) => (a._1
+ b._1, a._2 + b._2) }

sumandcount.foreachRDD(rdd => {
        rdd.collect.foreach({
                case (sum, n) => {
                        println("********************************************")
                        println("The Σ(x) is --> " + sum + " with
N --> " + n)
                        mean = sum/n
                        println("The μ is -->" + mean)
                        println("*********************************************")
                                }
                        })
                })

Regards,
Laeeq
Mime
View raw message