mahout-user mailing list archives

From Jeff Eastman <j...@windwardsolutions.com>
Subject Re: Clustering error
Date Mon, 04 Feb 2013 22:31:20 GMT
Kinda looks like you didn't specify the right input file. That job
expects the delimited values from the synthetic control download,
converts them to vectors, and clusters them. The vectors should have
cardinality 60, but somehow your input data produced one with 1151
elements. I'd look there.
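
One quick way to look at the input is to count the values on each line before handing the file to the job. The following is only a rough sketch in Python, not part of Mahout: it assumes the data file has one sample per line with whitespace-separated values, and the function name find_bad_lines is made up for illustration. The expected count of 60 comes from the error message.

```python
import sys

# Taken from the exception text: "Required cardinality 60 but got 1151".
EXPECTED_CARDINALITY = 60


def find_bad_lines(lines, expected=EXPECTED_CARDINALITY):
    """Return (line_number, value_count) for every non-blank line whose
    number of whitespace-separated values differs from `expected`.

    Assumption: one sample per line, values separated by whitespace.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        # Skip blank lines; report anything with the wrong number of values.
        if fields and len(fields) != expected:
            bad.append((number, len(fields)))
    return bad


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_input.py <path-to-input-file>
    with open(sys.argv[1]) as handle:
        for number, count in find_bad_lines(handle):
            print(f"line {number}: {count} values")
```

If this reports any lines, the file is probably not the plain synthetic control data the job expects, or it uses a different delimiter than assumed here.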


On 2/4/13 2:13 PM, Aysu Ezen wrote:
> Hi everyone,
>
> I am having difficulty running the clustering example at:
> https://cwiki.apache.org/MAHOUT/clustering-of-synthetic-control-data.html
>
> I have followed all the steps but am getting the error:
> org.apache.mahout.math.CardinalityException: Required cardinality 60 but got 1151
> at org.apache.mahout.math.AbstractVector.dot(AbstractVector.java:112)
> at org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure.distance(SquaredEuclideanDistanceMeasure.java:57)
> at org.apache.mahout.common.distance.EuclideanDistanceMeasure.distance(EuclideanDistanceMeasure.java:39)
> at org.apache.mahout.clustering.canopy.CanopyClusterer.addPointToCanopies(CanopyClusterer.java:153)
> at org.apache.mahout.clustering.canopy.CanopyReducer.reduce(CanopyReducer.java:46)
> at org.apache.mahout.clustering.canopy.CanopyReducer.reduce(CanopyReducer.java:29)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:650)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
> 13/02/04 14:11:41 INFO mapred.JobClient: Job complete: job_local_0002
> 13/02/04 14:11:41 INFO mapred.JobClient: Counters: 16
> 13/02/04 14:11:41 INFO mapred.JobClient:   FileSystemCounters
> 13/02/04 14:11:41 INFO mapred.JobClient:     FILE_BYTES_READ=259999568
> 13/02/04 14:11:41 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=245776500
> 13/02/04 14:11:41 INFO mapred.JobClient:   File Input Format Counters
> 13/02/04 14:11:41 INFO mapred.JobClient:     Bytes Read=1485833
> 13/02/04 14:11:41 INFO mapred.JobClient:   Map-Reduce Framework
> 13/02/04 14:11:41 INFO mapred.JobClient:     Map output materialized bytes=66080
> 13/02/04 14:11:41 INFO mapred.JobClient:     Map input records=3552
> 13/02/04 14:11:41 INFO mapred.JobClient:     Reduce shuffle bytes=0
> 13/02/04 14:11:41 INFO mapred.JobClient:     Spilled Records=101
> 13/02/04 14:11:41 INFO mapred.JobClient:     Map output bytes=65646
> 13/02/04 14:11:41 INFO mapred.JobClient:     Total committed heap usage (bytes)=929648640
> 13/02/04 14:11:41 INFO mapred.JobClient:     SPLIT_RAW_BYTES=535
> 13/02/04 14:11:41 INFO mapred.JobClient:     Combine input records=0
> 13/02/04 14:11:41 INFO mapred.JobClient:     Reduce input records=0
> 13/02/04 14:11:41 INFO mapred.JobClient:     Reduce input groups=0
> 13/02/04 14:11:41 INFO mapred.JobClient:     Combine output records=0
> 13/02/04 14:11:41 INFO mapred.JobClient:     Reduce output records=0
> 13/02/04 14:11:41 INFO mapred.JobClient:     Map output records=101
> *Exception in thread "main" java.lang.InterruptedException: Canopy Job failed processing output/data*
> at org.apache.mahout.clustering.canopy.CanopyDriver.buildClustersMR(CanopyDriver.java:349)
> at org.apache.mahout.clustering.canopy.CanopyDriver.buildClusters(CanopyDriver.java:236)
> at org.apache.mahout.clustering.canopy.CanopyDriver.run(CanopyDriver.java:145)
> at org.apache.mahout.clustering.canopy.CanopyDriver.run(CanopyDriver.java:160)
> at org.apache.mahout.clustering.syntheticcontrol.canopy.Job.run(Job.java:86)
> at org.apache.mahout.clustering.syntheticcontrol.canopy.Job.main(Job.java:54)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> I would appreciate it if you could help.
>

