mahout-user mailing list archives

From "Severance, Steve" <ssevera...@ebay.com>
Subject Seq2Sparse Exception with nGrams
Date Sat, 04 Sep 2010 00:39:43 GMT
When I set nGrams to a number greater than 1 (I have tried 2 and 3), I get the following exception.

Here is my command line:
 ./mahout seq2sparse -i <input> -a org.apache.lucene.analysis.WhitespaceAnalyzer -o <output> -x 60 -wt TFIDF -ng 2 -ow

10/09/03 17:35:22 INFO mapred.JobClient: Task Id : attempt_201007221306_12175_m_000013_0, Status : FAILED
java.lang.NullPointerException
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:86)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.mahout.utils.nlp.collocations.llr.Gram.write(Gram.java:181)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
        at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:179)
        at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:880)
        at org.apache.hadoop.mapred.Task$NewCombinerRunner$OutputConverter.write(Task.java:1197)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.mahout.utils.nlp.collocations.llr.CollocCombiner.reduce(CollocCombiner.java:40)
        at org.apache.mahout.utils.nlp.collocations.llr.CollocCombiner.reduce(CollocCombiner.java:25)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
        at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1217)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1227)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1091)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:512)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:585)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
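
The top two frames of the trace are what one would expect to see if a null byte array reached DataOutputStream.write from inside Gram.write. That is only an inference from the trace, not something confirmed against the Mahout source. A minimal standalone sketch of that failure mode, using only java.io (the class name, method name, and the null `bytes` variable are hypothetical and stand in for whatever Gram serializes):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class NullBufferWriteDemo {

    // Returns true when writing a null byte[] through a DataOutputStream
    // backed by a ByteArrayOutputStream raises NullPointerException.
    static boolean writeNullBufferThrowsNpe() {
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        byte[] bytes = null; // assumption: stands in for an unset serialized field
        try {
            // DataOutputStream.write(byte[], int, int) delegates to the
            // underlying stream, which dereferences the array.
            out.write(bytes, 0, 0);
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on null buffer: " + writeNullBufferThrowsNpe());
    }
}
```

If the same thing is happening here, the question becomes why the buffer is unset by the time the combiner writes the Gram, which this sketch does not answer.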


Is this a known issue? nGrams worked with 0.3.

Thanks.

Steve
