mahout-user mailing list archives

From Kris Jack <mrkrisj...@gmail.com>
Subject java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
Date Thu, 10 Jun 2010 12:28:13 GMT
While attempting to create a document-document similarity matrix, I am getting
the following error:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.
10-Jun-2010 13:25:04 org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
10-Jun-2010 13:25:04 org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
10-Jun-2010 13:25:04 org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: No job jar file set.  User classes may not be found. See
JobConf(Class) or JobConf#setJar(String).
10-Jun-2010 13:25:04 org.apache.hadoop.mapred.FileInputFormat listStatus
INFO: Total input paths to process : 1
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Running job: job_local_0001
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.FileInputFormat listStatus
INFO: Total input paths to process : 1
10-Jun-2010 13:25:05 org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using
builtin-java classes where applicable
10-Jun-2010 13:25:05 org.apache.hadoop.io.compress.CodecPool getDecompressor
INFO: Got brand-new decompressor
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask runOldMapper
INFO: numReduceTasks: 1
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask$MapOutputBuffer <init>
INFO: io.sort.mb = 100
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask$MapOutputBuffer <init>
INFO: data buffer = 79691776/99614720
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask$MapOutputBuffer <init>
INFO: record buffer = 262144/327680
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local_0001
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
    at org.apache.mahout.math.hadoop.TransposeJob$TransposeMapper.map(TransposeJob.java:1)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
10-Jun-2010 13:25:06 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO:  map 0% reduce 0%
10-Jun-2010 13:25:06 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0001
10-Jun-2010 13:25:06 org.apache.hadoop.mapred.Counters log
INFO: Counters: 0
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Job failed!
    at org.apache.mahout.math.hadoop.DistributedRowMatrix.transpose(DistributedRowMatrix.java:163)
    at org.apache.mahout.math.hadoop.GenSimMatrixLocal.generateMatrix(GenSimMatrixLocal.java:24)
    at org.apache.mahout.math.hadoop.GenSimMatrixLocal.main(GenSimMatrixLocal.java:34)
Caused by: java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
    at org.apache.mahout.math.hadoop.DistributedRowMatrix.transpose(DistributedRowMatrix.java:158)
    ... 2 more


I created a test Solr index with 3 documents and generated a sparse feature
matrix from it using Mahout's
org.apache.mahout.utils.vectors.lucene.Driver.

I then ran the following code using the sparse feature matrix as input
(mahoutIndexTFIDF.vec).

package org.apache.mahout.math.hadoop;

import org.apache.hadoop.mapred.JobConf;

public class GenSimMatrixLocal {

    private void generateMatrix() {
        String inputPath = "/home/kris/data/mahoutIndexTFIDF.vec";
        String tmpPath = "/tmp/matrixMultiplySpace";
        int numDocuments = 3;
        int numTerms = 4;

        DistributedRowMatrix text = new DistributedRowMatrix(inputPath,
            tmpPath, numDocuments, numTerms);

        JobConf conf = new JobConf("similarity job");
        text.configure(conf);

        DistributedRowMatrix transpose = text.transpose();

        DistributedRowMatrix similarity = transpose.times(transpose);

        System.out.println("Similarity matrix lives: " +
            similarity.getRowPath());
    }

    public static void main(String[] args) {
        GenSimMatrixLocal similarity = new GenSimMatrixLocal();

        similarity.generateMatrix();
    }
}
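For context on the exception itself: LongWritable and IntWritable are sibling classes under the Writable/WritableComparable hierarchy, and neither is a subtype of the other, so a runtime cast between them always throws ClassCastException regardless of configuration. A minimal Hadoop-free sketch (using hypothetical stand-in classes, not the real Hadoop types) shows the same failure mode:

```java
// Stand-ins for the Hadoop type hierarchy: LongW and IntW both implement
// a common Writable interface, but neither extends the other, exactly
// like LongWritable and IntWritable in org.apache.hadoop.io.
public class CastDemo {
    interface Writable {}                      // stand-in for o.a.h.io.Writable
    static class LongW implements Writable {}  // stand-in for LongWritable
    static class IntW implements Writable {}   // stand-in for IntWritable

    // Mimics what the mapper does: receives the key as the interface type
    // and casts it to the concrete type it was declared with.
    static boolean castFails() {
        Writable key = new LongW();            // what the input actually holds
        try {
            IntW k = (IntW) key;               // what the mapper expects
            return false;                      // unreachable: sibling types
        } catch (ClassCastException e) {
            return true;                       // cast between siblings fails
        }
    }

    public static void main(String[] args) {
        System.out.println("cast fails: " + castFails());
    }
}
```

So the exception suggests the input SequenceFile's keys are LongWritable while the transpose mapper is declared over IntWritable keys; the fix would lie in how the vectors were written, not in the cast itself.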

Does anyone see why the LongWritable-to-IntWritable cast is failing?  Does the
job need to be configured differently?

Thanks,
Kris
