spark-dev mailing list archives

From bradc <brad.carl...@oracle.com>
Subject Re: Design document - MLlib's statistical package for DataFrames
Date Thu, 16 Feb 2017 22:21:31 GMT
Hi,

Although it is also missing from spark.mllib, I'd suggest adding cardinality
(the number of distinct values per feature) to the simple descriptive
statistics for both spark.ml and spark.mllib. It is useful even for
double-precision floating-point data, as a measure of the "uniqueness" of the
feature data.

Cheers,
Brad
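
To illustrate the cardinality statistic suggested above, here is a minimal sketch in plain Python, not Spark code. The function name `column_cardinality` and the sample rows are hypothetical, and values are compared by exact equality, which is an assumption about how "distinct" would be defined for doubles.

```python
from typing import List, Sequence


def column_cardinality(rows: Sequence[Sequence[float]]) -> List[int]:
    """Return the number of distinct values in each column of `rows`.

    Hypothetical helper for illustration only; values are compared by
    exact equality, so 0.1 + 0.2 and 0.3 count as different values.
    """
    if not rows:
        return []
    # One set of seen values per column.
    distinct = [set() for _ in rows[0]]
    for row in rows:
        for seen, value in zip(distinct, row):
            seen.add(value)
    return [len(seen) for seen in distinct]


# Made-up sample data: three double-valued features, four rows.
rows = [
    [0.0, 1.5, 3.25],
    [1.0, 1.5, 7.5],
    [0.0, 1.5, 9.75],
    [1.0, 1.5, 0.125],
]
cardinalities = column_cardinality(rows)
print(cardinalities)
```

In this sample the first feature looks binary, the second is constant, and the third is unique in every row, which is the kind of information a per-feature cardinality would surface alongside mean, variance, min, and max.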




