spark-user mailing list archives

From Attila Tóth
Subject Multidimensional K-Means
Date Sun, 15 Feb 2015 16:26:11 GMT
Dear Spark User List,

I'm fairly new to Spark and am trying to use it for multi-dimensional
clustering (using the k-means clustering from MLlib). However, based on the
examples, the clustering seems to work only for a single dimension
(KMeans.train() accepts an RDD[Vector], which is a vector of doubles, whereas
I have a list of arrays of doubles, e.g. a list of n-dimensional coordinates).

Is there a way, given a list of arrays (or vectors) of doubles, to get
the list of cluster centres (as a list of n-dimensional coordinates) out
of Spark?

I'm using Scala.
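
To make the question concrete, a minimal sketch of what is being asked for
might look like the following. It assumes that each element of the RDD[Vector]
is one n-dimensional point (built with Vectors.dense), so the RDD as a whole is
the list of points. The object name MultiDimKMeans, the helper clusterCentres,
the local master setting and the sample points are all hypothetical and only
there for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

// Hypothetical helper object; not part of Spark or MLlib.
object MultiDimKMeans {

  // Takes n-dimensional points as arrays of doubles and returns the
  // cluster centres, again as arrays of doubles.
  def clusterCentres(sc: SparkContext,
                     points: Seq[Array[Double]],
                     k: Int,
                     maxIterations: Int): Seq[Array[Double]] = {
    // Assumption: one MLlib Vector per point, so RDD[Vector] is the point list.
    val data = sc.parallelize(points).map(p => Vectors.dense(p)).cache()
    val model = KMeans.train(data, k, maxIterations)
    // clusterCenters is an Array[Vector]; convert each centre back to an array.
    model.clusterCenters.map(_.toArray).toSeq
  }

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch can run on one machine.
    val conf = new SparkConf().setAppName("MultiDimKMeans").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up 3-dimensional sample points forming two obvious groups.
      val points = Seq(
        Array(0.0, 0.0, 0.0), Array(0.1, 0.1, 0.1),
        Array(9.0, 9.0, 9.0), Array(9.1, 9.1, 9.1)
      )
      clusterCentres(sc, points, k = 2, maxIterations = 20)
        .foreach(c => println(c.mkString("(", ", ", ")")))
    } finally {
      sc.stop()
    }
  }
}
```

If that assumption holds, every centre returned should have the same number of
dimensions as the input points.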

Thanks in advance,
