spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From mgCl2 <florent.jouante...@gmail.com>
Subject k-mean - result interpretation
Date Thu, 30 Oct 2014 17:35:22 GMT
Hello everyone,

I'm trying to use MLlib's K-mean algorithm.

I tried it on raw data, Here is a example of a line contained in my input
data set:
82.9817 3281.4495

with those parameters:
*numClusters*=4
*numIterations*=20

results:
*WSSSE = 6.375371241589461E9*

Then I normalized my data:
0.02219046937793337492 0.97780953062206662508
With the same parameters, result is now:
 *WSSSE= 0.04229916511906393*

Is it normal that normalization improve my results?
Why isn't the WSSSE normalized? Because it seems that having smaller values
end to a smaller WSSSE
I'm sure I missed something here!

Florent







--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/k-mean-result-interpretation-tp17748.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Mime
View raw message