mahout-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: "Not a file" issue with TwentyNewsGroups
Date Tue, 06 Apr 2010 21:59:42 GMT
What are the commands you are running?

On Apr 5, 2010, at 9:59 AM, Adam Hammer wrote:

> Hello all,
> 
> I am just starting out with Mahout, and to get my feet wet I am running
> through the TwentyNewsGroups example.  I have configured a single-node
> Hadoop system and a pseudo-distributed Hadoop system on two separate
> machines.  In both environments, I followed the guide to put all the news
> inputs into the folder 20news-input, and I am able to ls and cat the files
> in that directory.
> 
> However, when I go to run the TrainClassifier, I am getting the following
> message:
> 
> 10/04/05 09:48:33 INFO bayes.TrainClassifier: Training Complementary Bayes
> Classifier
> 10/04/05 09:48:33 INFO cbayes.CBayesDriver: Reading features...
> 10/04/05 09:48:33 WARN mapred.JobClient: Use GenericOptionsParser for
> parsing the arguments. Applications should implement Tool for the same.
> 10/04/05 09:48:33 INFO mapred.FileInputFormat: Total input paths to process
> : 19
> Exception in thread "main" java.io.IOException: Not a file:
> hdfs://localhost:9000/user/bob/20news-input/comp.graphics
>    at
> org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:206)
>    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
>    at
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
>    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
>    at
> org.apache.mahout.classifier.bayes.mapreduce.common.BayesFeatureDriver.runJob(BayesFeatureDriver.java:75)
>    at
> org.apache.mahout.classifier.bayes.mapreduce.cbayes.CBayesDriver.runJob(CBayesDriver.java:61)
>    at
> org.apache.mahout.classifier.bayes.TrainClassifier.trainCNaiveBayes(TrainClassifier.java:56)
>    at
> org.apache.mahout.classifier.bayes.TrainClassifier.main(TrainClassifier.java:128)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> 
> I get this error on both the single-node system I have set up and the
> separate dual-node system.  As I said before, I am able to cat and ls that
> directory and the files in it perfectly fine.  Any thoughts?
> 
> Thanks!
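
For reference while gathering the commands: the trace fails in FileInputFormat.getSplits, and in the old mapred API that message is raised when one of the listed input paths is a directory rather than a plain file. That suggests comp.graphics is a subdirectory of the path handed to the job. The sketch below illustrates the same kind of check on a local directory with plain Python. It is not Hadoop or Mahout code; the helper names and the sample file names (37261, alt.atheism.txt) are hypothetical.

```python
import tempfile
from pathlib import Path


def find_non_files(input_dir):
    """Return the entries directly under input_dir that are not regular files."""
    return sorted(p for p in Path(input_dir).iterdir() if not p.is_file())


def check_input(input_dir):
    """Raise on the first non-file entry, loosely mirroring the "Not a file" failure."""
    offenders = find_non_files(input_dir)
    if offenders:
        raise IOError(f"Not a file: {offenders[0]}")


if __name__ == "__main__":
    # Throwaway layout (hypothetical names): one flat file, one category subdirectory.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "20news-input"
        (root / "comp.graphics").mkdir(parents=True)
        (root / "comp.graphics" / "37261").write_text("sample post")
        (root / "alt.atheism.txt").write_text("one file per category")
        for entry in find_non_files(root):
            print("directory where a file was expected:", entry.name)
```

If the input folder on HDFS has the same shape, with one subdirectory per newsgroup, the job is probably being pointed at the raw extracted data rather than at a flat set of files, which is why the exact commands matter.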

