spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Multidimensional K-Means
Date Sun, 15 Feb 2015 16:29:27 GMT
Clustering operates on a large number of n-dimensional vectors. That
seems to be what you are describing, and that is what the MLlib API
accepts. What are you expecting that you don't find?

Did you have a look at the KMeansModel that KMeans.train() returns? It
has a "clusterCenters" member, an Array[Vector] with one n-dimensional
center per cluster, which gives you what you're looking for. Explore
the API a bit more first.
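
For reference, a minimal sketch of how that could look, assuming Spark 1.x
MLlib running in local mode. The object name MultiDimKMeans, the helper
clusterCenters and the sample points are made up for illustration; only
KMeans.train, Vectors.dense and KMeansModel.clusterCenters come from MLlib.
Each Array[Double] becomes one n-dimensional Vector, so the RDD[Vector] is
a collection of points, not a single vector.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

object MultiDimKMeans {

  // Hypothetical helper: clusters n-dimensional points and returns the
  // centers as plain arrays of doubles, one array per cluster.
  def clusterCenters(sc: SparkContext,
                     points: Seq[Array[Double]],
                     k: Int,
                     maxIterations: Int): Array[Array[Double]] = {
    // One dense Vector per point; the dimension is the array length.
    val data = sc.parallelize(points).map(p => Vectors.dense(p)).cache()
    val model = KMeans.train(data, k, maxIterations)
    // clusterCenters is an Array[Vector]; convert back to Array[Double].
    model.clusterCenters.map(_.toArray)
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for trying this out on one machine.
    val conf = new SparkConf().setAppName("MultiDimKMeans").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up 3-dimensional sample data forming two obvious groups.
      val points = Seq(
        Array(0.0, 0.1, 0.2), Array(0.2, 0.0, 0.1), Array(0.1, 0.2, 0.0),
        Array(9.0, 9.1, 9.2), Array(9.2, 9.0, 9.1), Array(9.1, 9.2, 9.0)
      )
      val centers = clusterCenters(sc, points, k = 2, maxIterations = 20)
      centers.foreach(c => println(c.mkString("[", ", ", "]")))
    } finally {
      sc.stop()
    }
  }
}
```

Every input array has to have the same length; the centers that come back
have that same dimension.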

On Sun, Feb 15, 2015 at 4:26 PM, Attila Tóth <atezs82@gmail.com> wrote:
> Dear Spark User List,
>
> I'm fairly new to Spark, trying to use it for multi-dimensional clustering
> (using the k-means clustering from MLlib). However, based on the examples the
> clustering seems to work only for a single dimension (KMeans.train() accepts
> an RDD[Vector], which is a vector of doubles - I have a list of arrays of
> doubles, e.g. a list of n-dimensional coordinates).
>
> Is there any way, given a list of arrays (or vectors) of doubles, to get
> the list of cluster centres (as a list of n-dimensional coordinates) in
> Spark?
>
> I'm using Scala.
>
> Thanks in advance,
> Attila


