spark-dev mailing list archives

From Tim Hunter <timhun...@databricks.com>
Subject Re: Design document - MLlib's statistical package for DataFrames
Date Fri, 17 Feb 2017 18:48:59 GMT
Hi Brad,

this task focuses on moving the existing algorithms, so that we
are not held up by parity issues.

Do you have some paper suggestions for cardinality? I do not think
there is a feature request on JIRA either.

Tim

On Thu, Feb 16, 2017 at 2:21 PM, bradc <brad.carlile@oracle.com> wrote:
> Hi,
>
> While it is also missing in spark.mllib, I'd suggest adding cardinality as
> part of the Simple descriptive statistics for both spark.ml and spark.mllib.
> This is useful even for data in double precision FP to understand the
> "uniqueness" of the feature data.
>
> Cheers,
> Brad
>
> --
> View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Design-document-MLlib-s-statistical-package-for-DataFrames-tp21014p21016.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>
