mahout-user mailing list archives

From Claudio Reggiani <nop...@gmail.com>
Subject Running CVB command
Date Fri, 08 Feb 2013 18:52:11 GMT
Hello,

I followed this tutorial:
https://cwiki.apache.org/MAHOUT/dimensional-reduction.html

and successfully created my matrix from the Reuters data. Since I want to
test the CVB algorithm, I ran this command:

$MAHOUT_HOME/bin/mahout cvb -i reuters-vectors/tfidf-matrix/matrix -o
reuters-vectors/lda -k 5 -ow -nt 41807 --maxIter 1 -dict
reuters-vectors/dictionary.file-0 -dt reuters-vectors/lda-topic -mt
reuters-vectors/lda-temp

Note:
- I set --maxIter 1 because I just want to check that I can run the
algorithm with the proper parameters.

The job gets stuck and I don't know why: it doesn't finish, and at the
same time it isn't using any CPU. Here is the tail of the log:

13/02/08 19:41:08 INFO mapred.JobClient:  map 99% reduce 0%
13/02/08 19:41:08 INFO cvb.ModelTrainer: Initiating stopping of training
threadpool
13/02/08 19:41:08 INFO cvb.ModelTrainer: threadpool took: 1.163836ms
13/02/08 19:41:08 INFO cvb.ModelTrainer: writeModel.awaitTermination() took
173.872662ms
13/02/08 19:41:08 INFO mapred.Task: Task:attempt_local_0003_m_000000_0 is
done. And is in the process of commiting
13/02/08 19:41:08 INFO mapred.LocalJobRunner:
13/02/08 19:41:08 INFO mapred.Task: Task attempt_local_0003_m_000000_0 is
allowed to commit now
13/02/08 19:41:08 INFO output.FileOutputCommitter: Saved output of task
'attempt_local_0003_m_000000_0' to reuters-vectors/lda-topic
13/02/08 19:41:11 INFO mapred.LocalJobRunner:
13/02/08 19:41:11 INFO mapred.LocalJobRunner:
13/02/08 19:41:11 INFO mapred.Task: Task 'attempt_local_0003_m_000000_0'
done.
13/02/08 19:41:11 INFO mapred.JobClient:  map 100% reduce 0%
13/02/08 19:41:11 INFO mapred.JobClient: Job complete: job_local_0003
13/02/08 19:41:11 INFO mapred.JobClient: Counters: 12
13/02/08 19:41:11 INFO mapred.JobClient:   File Output Format Counters
13/02/08 19:41:11 INFO mapred.JobClient:     Bytes Written=1185853
13/02/08 19:41:11 INFO mapred.JobClient:   File Input Format Counters
13/02/08 19:41:11 INFO mapred.JobClient:     Bytes Read=15326617
13/02/08 19:41:11 INFO mapred.JobClient:   FileSystemCounters
13/02/08 19:41:11 INFO mapred.JobClient:     FILE_BYTES_READ=124124017
13/02/08 19:41:11 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=97148688
13/02/08 19:41:11 INFO mapred.JobClient:   Map-Reduce Framework
13/02/08 19:41:11 INFO mapred.JobClient:     Map input records=21578
13/02/08 19:41:11 INFO mapred.JobClient:     Physical memory (bytes)
snapshot=0
13/02/08 19:41:11 INFO mapred.JobClient:     Spilled Records=0
13/02/08 19:41:11 INFO mapred.JobClient:     Total committed heap usage
(bytes)=181207040
13/02/08 19:41:11 INFO mapred.JobClient:     CPU time spent (ms)=0
13/02/08 19:41:11 INFO mapred.JobClient:     Virtual memory (bytes)
snapshot=0
13/02/08 19:41:11 INFO mapred.JobClient:     SPLIT_RAW_BYTES=159
13/02/08 19:41:11 INFO mapred.JobClient:     Map output records=21578
13/02/08 19:41:11 INFO driver.MahoutDriver: Program took 190584 ms
(Minutes: 3.1764)
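
To tell a hang apart from a run that has already finished, one option is to check a saved copy of the console output for the driver's final line. This is only a minimal sketch: it assumes the output was captured to a file (for example by appending `2>&1 | tee cvb.log` to the command), and both the file name `cvb.log` and the helper name `driver_finished` are hypothetical. It relies on the "Program took" line from driver.MahoutDriver, which is the last line in the tail above.

```shell
# Hypothetical helper: report whether a saved Mahout console log contains
# the driver's "Program took" line (the last line of the tail shown above).
# Assumption: the console output was saved to the file passed as $1.
driver_finished() {
  log_file="$1"
  # -q: quiet, only set the exit status; -F: match the text literally.
  if grep -qF "driver.MahoutDriver: Program took" "$log_file"; then
    echo "finished"
  else
    echo "not finished"
  fi
}
```

Usage would be `driver_finished cvb.log`. If it prints "finished" while the process is still alive, the remaining wait is probably happening after the MapReduce job itself, which narrows down where to look.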

Any suggestions?

Thanks
Claudio
