mahout-user mailing list archives

From Chris Grier <gr...@imchris.org>
Subject LDA clustering example not working
Date Fri, 02 Dec 2011 18:48:56 GMT
Hi,

I'm trying to get the LDA example working. I'm working out of svn, revision
#1209631 (current as of right now). The k-means example works fine.

I have Hadoop running and the configs set up properly for running MR jobs
and using HDFS. I'm starting from ./cluster-reuters.sh and choosing LDA.
The first step in the example script, seq2sparse, completes successfully.
The second step, actually running LDA, does not. It fails with the
following exception:

Exception in thread "main" java.lang.IllegalStateException: hdfs://hostname:54310/tmp/mahout-work-hadoop/reuters-out-seqdir-sparse-lda/tf-vectors/_logs
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:82)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:73)
        at com.google.common.collect.Iterators$8.next(Iterators.java:765)
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:526)
        at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
        at org.apache.mahout.clustering.lda.LDADriver.determineNumberOfWordsFromFirstVector(LDADriver.java:204)
        at org.apache.mahout.clustering.lda.LDADriver.run(LDADriver.java:164)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.mahout.clustering.lda.LDADriver.main(LDADriver.java:90)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.io.IOException: Cannot open filename /tmp/mahout-work-hadoop/reuters-out-seqdir-sparse-lda/tf-vectors/_logs
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1526)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1517)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:384)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
        at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1444)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1431)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileValueIterator.<init>(SequenceFileValueIterator.java:51)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:77)
        ... 20 more
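
One reading of the trace, offered as a guess rather than a confirmed
diagnosis: the path that cannot be opened is tf-vectors/_logs, and it is
being opened as a SequenceFile while LDADriver looks for the first vector.
Hadoop's own input formats conventionally skip entries whose names start
with "_" or ".", so it may be that this directory listing is not being
filtered the same way. The sketch below only illustrates that naming
convention in plain Java. It does not use the Hadoop or Mahout APIs; the
class name, the helper names and the sample listing are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HiddenPathFilterSketch {

    // Hypothetical helper: true for names that Hadoop conventionally treats
    // as non-data entries (leading '_' or '.'), e.g. _logs or .part-r-00000.crc.
    static boolean isHidden(String name) {
        return name.startsWith("_") || name.startsWith(".");
    }

    // Keeps only the entries that look like real data files.
    static List<String> dataFiles(List<String> names) {
        List<String> kept = new ArrayList<String>();
        for (String name : names) {
            if (!isHidden(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative listing only; not copied from the actual tf-vectors directory.
        List<String> listing = Arrays.asList("_logs", "part-r-00000", ".part-r-00000.crc");
        System.out.println(dataFiles(listing));
    }
}
```

If that reading is right, a filter along these lines applied to the
directory listing would leave only the part files for the iterator to open.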

-Chris
