mahout-user mailing list archives

From Johanna Del Pino <johanna.delp...@gmail.com>
Subject Problem reading Spectral KMeans Clustering Output
Date Mon, 25 Nov 2013 19:05:19 GMT
Hello,

I'm using Spectral KMeans Clustering to generate 15 clusters. My input data
are 360 vectors of 90 dimensions each. I have built the affinity matrix
(360x360) and run the SpectralKMeansDriver on this matrix without
problems. Right now I'm trying to understand my output data (that is, to see
which cluster each vector has been assigned to). However, the output looks
different from what normal KMeans Clustering output looks like, so I would
like to know how to read my output in this case.
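
For readers following along, here is a minimal sketch of what building such an affinity matrix could look like. The post does not say which similarity function was used, so the Gaussian kernel, the `sigma` bandwidth, and the class and method names below are all assumptions made for illustration. The point it shows is that row i of the matrix stands for input vector i, so that row index is the only link back to the original data.

```java
// Sketch only: the post does not say how the affinity matrix was built.
// ASSUMPTION: a Gaussian (RBF) kernel on Euclidean distance; all names are hypothetical.
class AffinitySketch {

    // Squared Euclidean distance between two vectors of equal length.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    // Builds a symmetric n x n affinity matrix; sigma is a hypothetical bandwidth.
    static double[][] affinity(double[][] points, double sigma) {
        int n = points.length;
        double[][] a = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double w = Math.exp(-squaredDistance(points[i], points[j])
                        / (2.0 * sigma * sigma));
                a[i][j] = w; // similarity between input vectors i and j
                a[j][i] = w; // the matrix is symmetric
            }
        }
        return a;
    }

    public static void main(String[] args) {
        // Three toy 2-dimensional vectors stand in for the 360 x 90 input.
        double[][] points = { {0.0, 0.0}, {1.0, 0.0}, {0.0, 2.0} };
        double[][] a = affinity(points, 1.0);
        for (double[] row : a) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

With 360 input vectors, the same loop would yield the 360x360 matrix described above, regardless of the 90 dimensions of each vector.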

This is my code:
--------------------

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path output = new Path("output");
HadoopUtil.delete(conf, output);

SpectralKMeansDriver.run(conf, new Path("testdata"), new Path("output"),
    360, 15, new EuclideanDistanceMeasure(), 0.001, 10,
    new Path("tmpDir"), true);

SequenceFile.Reader reader = new SequenceFile.Reader(fs,
    new Path("output/kmeans_out/clusteredPoints/part-m-00000"), conf);

IntWritable key = new IntWritable();
WeightedVectorWritable value = new WeightedVectorWritable();
while (reader.next(key, value)) {
    System.out.println(value.toString() + " belongs to cluster "
        + key.toString());
}
reader.close();

This is part of my output:
----------------------------------------
 1.0: [0.207, -0.011, -0.052, 0.074, -0.107, -0.463, -0.411, -0.234,
-0.412, -0.527, 0.093, 0.006, -0.081, -0.143, -0.127] belongs to cluster 5
1.0: [0.185, -0.054, -0.091, 0.039, 0.132, -0.274, 0.235, -0.298, -0.191,
-0.159, 0.176, 0.253, 0.577, 0.090, 0.468] belongs to cluster 12
1.0: [0.213, -0.091, -0.016, 0.051, -0.032, -0.346, -0.308, -0.173, -0.521,
-0.492, 0.038, -0.087, -0.246, -0.068, -0.332] belongs to cluster 5
1.0: [0.220, -0.015, -0.193, -0.042, 0.164, -0.479, -0.160, -0.333, -0.483,
-0.333, 0.068, 0.070, 0.258, -0.031, 0.314] belongs to cluster 12
1.0: [0.253, -0.057, 0.038, 0.136, -0.197, -0.247, -0.539, -0.506, -0.107,
-0.343, 0.154, -0.002, 0.092, 0.082, -0.313] belongs to cluster 5
1.0: [0.319, -0.004, 0.303, 0.101, -0.218, -0.562, -0.372, 0.082, -0.221,
-0.117, -0.329, -0.152, -0.021, -0.101, 0.287] belongs to cluster 5
1.0: [0.240, 0.154, -0.070, -0.022, -0.397, -0.482, -0.328, 0.020, -0.365,
-0.273, -0.195, 0.022, 0.345, 0.025, 0.223] belongs to cluster 12
1.0: [0.232, 0.125, -0.017, 0.012, -0.250, -0.505, -0.451, 0.010, -0.169,
-0.276, 0.055, 0.196, -0.469, -0.110, -0.175] belongs to cluster 5
1.0: [0.236, 0.074, -0.039, 0.035, -0.260, -0.608, -0.376, -0.094, -0.311,
-0.315, -0.116, 0.157, -0.296, -0.151, -0.067] belongs to cluster 5
1.0: [0.217, 0.113, -0.088, 0.016, -0.192, -0.514, -0.359, 0.047, -0.329,
-0.187, -0.026, 0.256, -0.504, -0.157, -0.107] belongs to cluster 5
1.0: [0.263, 0.132, 0.001, 0.153, -0.261, -0.714, -0.272, -0.171, -0.369,
-0.224, -0.041, 0.069, -0.006, -0.120, -0.043] belongs to cluster 5
1.0: [0.237, 0.141, 0.088, 0.140, -0.331, -0.495, -0.084, -0.064, -0.193,
-0.264, -0.088, -0.000, 0.637, 0.080, 0.065] belongs to cluster 12
1.0: [0.242, -0.213, 0.098, 0.024, 0.379, 0.495, -0.039, -0.383, -0.017,
-0.334, -0.125, 0.396, 0.069, -0.087, -0.229] belongs to cluster 1

Normally, with KMeans Clustering, each line in my output file represents
one vector from my input file together with its cluster. With Spectral
KMeans Clustering, however, these lines represent neither my vectors nor
the data in the affinity matrix, so I'm not sure how to map this
information back to my 360 initial vectors.
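
Each printed vector above has 15 components, the same as the number of clusters requested, which is consistent with the clustered points being a reduced spectral representation rather than the 90-dimensional inputs. As a way of inspecting such output, the sketch below records the cluster id of each record in reading order. It parses lines in the printed format only. The class and method names are hypothetical, and whether the k-th record read corresponds to row k of the affinity matrix is an unverified assumption that would need to be checked against the Mahout version in use.

```java
// Sketch only: parses lines in the format printed by the code in this post.
// ASSUMPTION (unverified): records come back in the same order as the rows
// of the affinity matrix. All names here are hypothetical.
class ClusterTallySketch {

    // Extracts the cluster id from a line ending in "belongs to cluster <id>".
    static int clusterOf(String line) {
        String marker = "belongs to cluster ";
        int at = line.lastIndexOf(marker);
        if (at < 0) {
            throw new IllegalArgumentException("no cluster marker: " + line);
        }
        return Integer.parseInt(line.substring(at + marker.length()).trim());
    }

    // Returns an array where element k is the cluster id of the k-th record read.
    static int[] assignments(String[] lines) {
        int[] result = new int[lines.length];
        for (int k = 0; k < lines.length; k++) {
            result[k] = clusterOf(lines[k]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Vectors are shortened here; only the trailing cluster id is parsed.
        String[] lines = {
            "1.0: [0.207, -0.011] belongs to cluster 5",
            "1.0: [0.185, -0.054] belongs to cluster 12",
            "1.0: [0.213, -0.091] belongs to cluster 5"
        };
        int[] byPosition = assignments(lines);
        for (int k = 0; k < byPosition.length; k++) {
            System.out.println("record " + k + " -> cluster " + byPosition[k]);
        }
    }
}
```

If the ordering assumption does not hold, the positions recorded this way say nothing about the original vectors, so the result should be treated as a diagnostic only.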

Thank you,
Johanna
