mahout-user mailing list archives

From Kris Jack <mrkrisj...@gmail.com>
Subject Re: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
Date Thu, 10 Jun 2010 14:20:55 GMT
Got a little further by making some more class changes...

//
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.common.AbstractJob;
import org.apache.mahout.math.hadoop.DistributedRowMatrix;

public class GenSimMatrixJob extends AbstractJob {

    public GenSimMatrixJob() {
    }

    @Override
    public int run(String[] strings) throws Exception {
        addOption("numDocs", "nd", "Number of documents in the input");
        addOption("numTerms", "nt", "Number of terms in the input");

        Map<String, String> parsedArgs = parseArguments(strings);
        if (parsedArgs == null) {
            // FIXME
            return 0;
        }

        Configuration originalConf = getConf();
        String inputPathString = originalConf.get("mapred.input.dir");
        String outputTmpPathString = parsedArgs.get("--tempDir");
        int numDocs = Integer.parseInt(parsedArgs.get("--numDocs"));
        int numTerms = Integer.parseInt(parsedArgs.get("--numTerms"));

        // Wrap the term vectors as a distributed matrix of
        // numDocs rows by numTerms columns.
        DistributedRowMatrix text = new DistributedRowMatrix(inputPathString,
                outputTmpPathString, numDocs, numTerms);
        text.configure(new JobConf(getConf()));

        // Transpose, then multiply to build the document-document similarity matrix.
        DistributedRowMatrix transpose = text.transpose();
        DistributedRowMatrix similarity = transpose.times(transpose);

        System.out.println("Similarity matrix lives: " + similarity.getRowPath());

        return 1;
    }

    public static void main(String[] args) throws Exception {
        ToolRunner.run(new GenSimMatrixJob(), args);
    }
}
//
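
In case it helps, a minimal driver for this class would look something like the sketch below. The driver class name is made up, the input path and the document/term counts are placeholders (they just mirror the small test set quoted further down), and I'm assuming AbstractJob supplies the --tempDir option itself, since run() reads it without declaring it. ToolRunner's GenericOptionsParser is what turns the -D pair into the mapred.input.dir entry that run() reads from getConf().

//
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;

public class GenSimMatrixJobDriver {
    public static void main(String[] args) throws Exception {
        // ToolRunner runs a GenericOptionsParser first, so the -D pair below
        // lands in getConf() as mapred.input.dir; the remaining options are
        // handed to GenSimMatrixJob.run() and parsed by parseArguments().
        String[] jobArgs = {
            "-D", "mapred.input.dir=/home/kris/data/mahoutIndexTFIDF.vec", // placeholder input path
            "--numDocs", "3",   // placeholder counts for a tiny test index
            "--numTerms", "4",
            "--tempDir", "/tmp/matrixMultiplySpace" // placeholder temp dir
        };
        ToolRunner.run(new Configuration(), new GenSimMatrixJob(), jobArgs);
    }
}
//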

Running the job gives the following error:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
10-Jun-2010 15:16:28 org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
10-Jun-2010 15:16:28 org.apache.hadoop.mapred.JobClient configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10-Jun-2010 15:16:28 org.apache.hadoop.mapred.JobClient configureCommandLineOptions
WARNING: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
10-Jun-2010 15:16:28 org.apache.hadoop.mapred.FileInputFormat listStatus
INFO: Total input paths to process : 1
10-Jun-2010 15:16:28 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Running job: job_local_0001
10-Jun-2010 15:16:28 org.apache.hadoop.mapred.FileInputFormat listStatus
INFO: Total input paths to process : 1
10-Jun-2010 15:16:28 org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
10-Jun-2010 15:16:28 org.apache.hadoop.io.compress.CodecPool getDecompressor
INFO: Got brand-new decompressor
10-Jun-2010 15:16:28 org.apache.hadoop.mapred.MapTask runOldMapper
INFO: numReduceTasks: 1
10-Jun-2010 15:16:28 org.apache.hadoop.mapred.MapTask$MapOutputBuffer <init>
INFO: io.sort.mb = 100
10-Jun-2010 15:16:29 org.apache.hadoop.mapred.MapTask$MapOutputBuffer <init>
INFO: data buffer = 79691776/99614720
10-Jun-2010 15:16:29 org.apache.hadoop.mapred.MapTask$MapOutputBuffer <init>
INFO: record buffer = 262144/327680
10-Jun-2010 15:16:29 org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local_0001
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
    at org.apache.mahout.math.hadoop.TransposeJob$TransposeMapper.map(TransposeJob.java:1)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
10-Jun-2010 15:16:29 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO:  map 0% reduce 0%
10-Jun-2010 15:16:29 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0001
10-Jun-2010 15:16:29 org.apache.hadoop.mapred.Counters log
INFO: Counters: 0
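
If I read the trace correctly, TransposeJob's mapper expects IntWritable row keys (that is how a DistributedRowMatrix seems to be keyed) but is being fed LongWritable keys from the input file. A quick way to confirm what the lucene Driver actually wrote is to read the SequenceFile header; a minimal sketch (the class name is made up, and the path is just the test file mentioned below):

//
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

public class InspectVectorFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder path: the sparse vector file written by the lucene Driver.
        Path path = new Path("/home/kris/data/mahoutIndexTFIDF.vec");
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        try {
            // If the key class printed here is LongWritable (or Text) rather than
            // IntWritable, the vectors would need re-keying before they can back
            // a DistributedRowMatrix.
            System.out.println("key class:   " + reader.getKeyClassName());
            System.out.println("value class: " + reader.getValueClassName());
        } finally {
            reader.close();
        }
    }
}
//

If the key class turns out not to be IntWritable, that would explain the cast failure in the transpose job.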



2010/6/10 Kris Jack <mrkrisjack@gmail.com>

> While attempting to create a document-document similarity matrix, I am
> getting the following error:
>
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> 10-Jun-2010 13:25:04 org.apache.hadoop.metrics.jvm.JvmMetrics init
> INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
> 10-Jun-2010 13:25:04 org.apache.hadoop.mapred.JobClient configureCommandLineOptions
> WARNING: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 10-Jun-2010 13:25:04 org.apache.hadoop.mapred.JobClient configureCommandLineOptions
> WARNING: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> 10-Jun-2010 13:25:04 org.apache.hadoop.mapred.FileInputFormat listStatus
> INFO: Total input paths to process : 1
> 10-Jun-2010 13:25:05 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
> INFO: Running job: job_local_0001
> 10-Jun-2010 13:25:05 org.apache.hadoop.mapred.FileInputFormat listStatus
> INFO: Total input paths to process : 1
> 10-Jun-2010 13:25:05 org.apache.hadoop.util.NativeCodeLoader <clinit>
> WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 10-Jun-2010 13:25:05 org.apache.hadoop.io.compress.CodecPool getDecompressor
> INFO: Got brand-new decompressor
> 10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask runOldMapper
> INFO: numReduceTasks: 1
> 10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask$MapOutputBuffer <init>
> INFO: io.sort.mb = 100
> 10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask$MapOutputBuffer <init>
> INFO: data buffer = 79691776/99614720
> 10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask$MapOutputBuffer <init>
> INFO: record buffer = 262144/327680
> 10-Jun-2010 13:25:05 org.apache.hadoop.mapred.LocalJobRunner$Job run
> WARNING: job_local_0001
> java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
>     at org.apache.mahout.math.hadoop.TransposeJob$TransposeMapper.map(TransposeJob.java:1)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 10-Jun-2010 13:25:06 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
> INFO:  map 0% reduce 0%
> 10-Jun-2010 13:25:06 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
> INFO: Job complete: job_local_0001
> 10-Jun-2010 13:25:06 org.apache.hadoop.mapred.Counters log
> INFO: Counters: 0
> Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Job failed!
>     at org.apache.mahout.math.hadoop.DistributedRowMatrix.transpose(DistributedRowMatrix.java:163)
>     at org.apache.mahout.math.hadoop.GenSimMatrixLocal.generateMatrix(GenSimMatrixLocal.java:24)
>     at org.apache.mahout.math.hadoop.GenSimMatrixLocal.main(GenSimMatrixLocal.java:34)
> Caused by: java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>     at org.apache.mahout.math.hadoop.DistributedRowMatrix.transpose(DistributedRowMatrix.java:158)
>     ... 2 more
>
>
> I created a test Solr index with 3 documents and generated a sparse feature
> matrix from it using Mahout's org.apache.mahout.utils.vectors.lucene.Driver.
>
> I then ran the following code using the sparse feature matrix as input
> (mahoutIndexTFIDF.vec).
>
> public class GenSimMatrixLocal {
>
>     private void generateMatrix() {
>         String inputPath = "/home/kris/data/mahoutIndexTFIDF.vec";
>         String tmpPath = "/tmp/matrixMultiplySpace";
>         int numDocuments = 3;
>         int numTerms = 4;
>
>         DistributedRowMatrix text = new DistributedRowMatrix(inputPath,
>                 tmpPath, numDocuments, numTerms);
>
>         JobConf conf = new JobConf("similarity job");
>         text.configure(conf);
>
>         DistributedRowMatrix transpose = text.transpose();
>
>         DistributedRowMatrix similarity = transpose.times(transpose);
>
>         System.out.println("Similarity matrix lives: " + similarity.getRowPath());
>     }
>
>     public static void main(String[] args) {
>         GenSimMatrixLocal similarity = new GenSimMatrixLocal();
>
>         similarity.generateMatrix();
>     }
> }
>
> Can anyone see why there is a problem casting LongWritable to IntWritable?
> Does something need to be configured differently?
>
> Thanks,
> Kris
>


-- 
Dr Kris Jack,
http://www.mendeley.com/profiles/kris-jack/
