mahout-user mailing list archives

From Anne Sauve <Anne.Sa...@hotmail.com>
Subject computing the distance between 2 values from fuzzyKmeans clustering clusteredPoints
Date Tue, 09 Dec 2014 21:57:40 GMT
Hello there,

I have been trying for a while to compute the pairwise distance between all the 
clustered points in order to fill in a distanceMatrix which will then be used to 
compute the silhouette of my clustering.

Here is my code (I am using Mahout 0.8):

SequenceFile.Reader reader1 = new SequenceFile.Reader(fs,
    new Path("data/testdata/output8Box/clusteredPoints" + "/part-m-00000"), conf);
IntWritable key1 = new IntWritable();
WeightedVectorWritable value1 = new WeightedVectorWritable();

List<NamedVector> clusters = new ArrayList<NamedVector>();
while (reader1.next(key1, value1)) {
    System.out.println(value1.toString() + " belongs to cluster " + key1.toString());
    NamedVector cluster = (NamedVector) value1.getVector();
    clusters.add(cluster);
}
// Compute the distanceMatrix
DistanceMeasure measure = new CosineDistanceMeasure();
for (int i = 0; i < clusters.size(); i++) {
    for (int j = i + 1; j < clusters.size(); j++) {
        double d = measure.distance(clusters.get(i), clusters.get(j));
        System.out.println("dist " + i + " ; " + j + " : " + d);
    }
}
 

When I run it, I get the following exception:

Exception in thread "main" java.lang.ClassCastException: org.apache.mahout.math.RandomAccessSparseVector cannot be cast to org.apache.mahout.math.NamedVector

How do I convert a sparse vector into a NamedVector?
Is there a better way to proceed?
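
To make the goal concrete: once a pairwise distance matrix is available, the silhouette can be computed from it without any Mahout types. Below is a minimal sketch in plain Java; it is not Mahout's API. The class name SilhouetteSketch, its method names, the double[] point representation and the hard labels array are all assumptions made for illustration. Fuzzy k-means produces soft memberships, so the sketch also assumes each point has already been assigned to a single cluster (for example its most likely one).

```java
class SilhouetteSketch {

    // Cosine distance: 1 - (a . b) / (|a| * |b|).
    // Assumes both vectors have the same length and neither is all zeros.
    public static double cosineDistance(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Symmetric pairwise distance matrix with a zero diagonal.
    public static double[][] distanceMatrix(double[][] points) {
        int n = points.length;
        double[][] dist = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                dist[i][j] = cosineDistance(points[i], points[j]);
                dist[j][i] = dist[i][j];
            }
        }
        return dist;
    }

    // Mean silhouette over all points; labels must lie in 0..k-1.
    // Points in singleton clusters contribute 0, the usual convention.
    public static double meanSilhouette(double[][] dist, int[] labels, int k) {
        int n = labels.length;
        int[] sizes = new int[k];
        for (int label : labels) {
            sizes[label]++;
        }
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            int own = labels[i];
            if (sizes[own] <= 1) {
                continue;
            }
            // Sum of distances from point i to the members of each cluster.
            double[] sums = new double[k];
            for (int j = 0; j < n; j++) {
                if (j != i) {
                    sums[labels[j]] += dist[i][j];
                }
            }
            double a = sums[own] / (sizes[own] - 1); // mean distance within own cluster
            double b = Double.POSITIVE_INFINITY;     // mean distance to nearest other cluster
            for (int c = 0; c < k; c++) {
                if (c != own && sizes[c] > 0) {
                    b = Math.min(b, sums[c] / sizes[c]);
                }
            }
            if (Double.isInfinite(b)) {
                continue; // no other non-empty cluster: treat the silhouette as 0
            }
            double max = Math.max(a, b);
            total += (max == 0.0) ? 0.0 : (b - a) / max;
        }
        return total / n;
    }

    public static void main(String[] args) {
        // Made-up toy data: two tight groups pointing in different directions.
        double[][] points = {{1.0, 0.0}, {0.9, 0.1}, {0.0, 1.0}, {0.1, 0.9}};
        int[] labels = {0, 0, 1, 1};
        double[][] dist = distanceMatrix(points);
        System.out.println("mean silhouette: " + meanSilhouette(dist, labels, 2));
    }
}
```

With Mahout vectors, the matrix-filling loop would call measure.distance(...) in place of the hypothetical cosineDistance helper; the silhouette step itself only needs the matrix and one cluster label per point.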

Thanks a lot for your help.

Anne


