mahout-user mailing list archives

From Adam Hammer <adam.ham...@gmail.com>
Subject "Not a file" issue with TwentyNewsGroups
Date Mon, 05 Apr 2010 13:59:51 GMT
Hello all,

I am just starting out with Mahout, and to get my feet wet I am running
through the TwentyNewsGroups example.  I have successfully configured a
single-node Hadoop system as well as a pseudo-distributed Hadoop system on
two separate machines.  In both environments, I followed the guide to put
all the news inputs into the folder 20news-input, and I can ls and cat the
files in that directory without problems.

However, when I go to run the TrainClassifier, I am getting the following
message:

10/04/05 09:48:33 INFO bayes.TrainClassifier: Training Complementary Bayes Classifier
10/04/05 09:48:33 INFO cbayes.CBayesDriver: Reading features...
10/04/05 09:48:33 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/04/05 09:48:33 INFO mapred.FileInputFormat: Total input paths to process : 19
Exception in thread "main" java.io.IOException: Not a file: hdfs://localhost:9000/user/bob/20news-input/comp.graphics
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:206)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
    at org.apache.mahout.classifier.bayes.mapreduce.common.BayesFeatureDriver.runJob(BayesFeatureDriver.java:75)
    at org.apache.mahout.classifier.bayes.mapreduce.cbayes.CBayesDriver.runJob(CBayesDriver.java:61)
    at org.apache.mahout.classifier.bayes.TrainClassifier.trainCNaiveBayes(TrainClassifier.java:56)
    at org.apache.mahout.classifier.bayes.TrainClassifier.main(TrainClassifier.java:128)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

I get this error on both the single-node system I have set up and the
separate dual-node system.  As I said before, I am able to cat and ls that
directory and the files in it perfectly fine.  Any thoughts?
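
One thing that may be worth checking, going only by the trace above: the exception is raised in FileInputFormat.getSplits, which (as far as I understand the old mapred API) rejects any entry directly under the input path that is a directory rather than a plain file. That would mean comp.graphics is a subdirectory of 20news-input, not a single file. The sketch below is not Hadoop or Mahout code. It assumes a local copy of the input folder and uses java.nio.file as a stand-in for HDFS; the class name InputDirCheck and the method findSubdirectories are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class InputDirCheck {

    /**
     * Returns the entries directly under inputDir that are directories.
     * Hypothetical helper: it mirrors the kind of check that seems to
     * produce "Not a file", but runs against the local filesystem.
     */
    static List<Path> findSubdirectories(Path inputDir) throws IOException {
        List<Path> dirs = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(inputDir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    dirs.add(entry);
                }
            }
        }
        return dirs;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: a local copy of the folder named in the post.
        Path inputDir = Paths.get(args.length > 0 ? args[0] : "20news-input");
        if (!Files.isDirectory(inputDir)) {
            System.out.println("No such directory: " + inputDir);
            return;
        }
        List<Path> dirs = findSubdirectories(inputDir);
        if (dirs.isEmpty()) {
            System.out.println("All entries under " + inputDir + " are plain files.");
        } else {
            for (Path dir : dirs) {
                System.out.println("Not a file: " + dir);
            }
        }
    }
}
```

If this lists any entries, the input folder holds nested directories, and the job would presumably need a flat folder of plain files instead.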

Thanks!
