mahout-user mailing list archives

From jung hoon sohn <jsoh...@gmail.com>
Subject KmeansDriver Question
Date Sat, 15 Sep 2012 07:29:30 GMT
Hello, I am trying to cluster input data using KMeansDriver.
The input vectors were created with the
"bin/mahout lucene.vector ..." command, and when I run
KMeansDriver through its run method, I get

12/09/15 15:18:13 INFO mapred.JobClient: Task Id : attempt_201209121951_0067_m_000000_1, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
        at org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper.map(SequenceFileTokenizerMapper.java:37)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

for several attempts, but the job continues and generates the output data.
I can even run clusterdump on the output cluster data; however, I am
concerned about the effect of the above errors.

Please help me resolve this problem.

Thanks.

Jung Hoon
