spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: spark sql median and standard deviation
Date Wed, 04 Mar 2015 21:47:31 GMT
Please take a look at DoubleRDDFunctions.scala:

  /** Compute the mean of this RDD's elements. */
  def mean(): Double = stats().mean

  /** Compute the variance of this RDD's elements. */
  def variance(): Double = stats().variance

  /** Compute the standard deviation of this RDD's elements. */
  def stdev(): Double = stats().stdev
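
None of the methods above returns a median. What follows is a minimal sketch of one way to compute an exact median on an RDD[Double]: sort, index, and look up the middle element(s). The object name MedianSketch, the helper median, the sample values, and the local[*] master are all hypothetical and chosen only for illustration. The full sort is not necessarily the most efficient approach for large data.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// Needed on Spark 1.2 and earlier for the implicit conversions to
// DoubleRDDFunctions and PairRDDFunctions; harmless on later versions.
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD

// Hypothetical object and helper names, for illustration only.
object MedianSketch {

  /** Exact median via a full sort. Assumes a non-empty RDD. */
  def median(values: RDD[Double]): Double = {
    // Pair every element with its rank in sorted order, keyed by rank.
    val byRank = values.sortBy(x => x).zipWithIndex().map(_.swap).cache()
    val n = byRank.count()
    require(n > 0, "median of an empty RDD is undefined")

    val result =
      if (n % 2 == 1) {
        // Odd count: the single middle element.
        byRank.lookup(n / 2).head
      } else {
        // Even count: average of the two middle elements.
        (byRank.lookup(n / 2 - 1).head + byRank.lookup(n / 2).head) / 2.0
      }

    byRank.unpersist()
    result
  }

  def main(args: Array[String]): Unit = {
    // Local master assumed so the sketch can be run without a cluster.
    val conf = new SparkConf().setAppName("median-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val values = sc.parallelize(Seq(4.0, 1.0, 3.0, 2.0, 5.0))

    // One pass over the data yields count, mean, variance and stdev together.
    val stats = values.stats()
    println(s"mean=${stats.mean} stdev=${stats.stdev} median=${median(values)}")

    sc.stop()
  }
}
```

Note that stdev() here is the population standard deviation (it divides by n); sampleStdev() is available if the n - 1 form is wanted.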

Cheers

On Wed, Mar 4, 2015 at 10:51 AM, tridib <tridib.samanta@live.com> wrote:

> Hello,
> Is there a built-in function for getting the median and standard deviation in
> Spark SQL? Currently I am converting the SchemaRDD to a DoubleRDD and calling
> doubleRDD.stats(), but it still does not provide the median.
>
> What is the most efficient way to get the median?
>
> Thanks & Regards
> Tridib
