nutch-dev mailing list archives

From Ken Krugler <kkrugler_li...@transpac.com>
Subject Re: IOException in dedup
Date Tue, 02 Jun 2009 16:41:32 GMT
>Hello,
>
>I am new to Nutch and have set up Nutch 0.9 with Easy Eclipse on 
>Mac OS X. When I try to start crawling, I get the following exception:
>
>Dedup: starting
>Dedup: adding indexes in: crawl/indexes
>Exception in thread "main" java.io.IOException: Job failed!
>	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>	at 
>org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
>	at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
>
>
>Does anyone know how to solve this problem? 

Hadoop can report a generic IOException when the root cause is that 
the JVM has run out of memory. Normally the OutOfMemoryError will 
show up in the hadoop.log file.
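A quick way to check is to grep the log for the error. This is only a
sketch: the log file and its contents below are simulated for
illustration (in a stock Nutch 0.9 install the log usually lives at
logs/hadoop.log under the Nutch directory):

```shell
# Simulate a hadoop.log containing an OOM entry (illustrative only).
printf '%s\n' \
  '2009-06-02 09:15:01,123 WARN  mapred.LocalJobRunner - job_local_0001' \
  'java.lang.OutOfMemoryError: Java heap space' > hadoop.log

# Search the log, printing matching line numbers.
grep -n 'OutOfMemoryError' hadoop.log
# → 2:java.lang.OutOfMemoryError: Java heap space
```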

If you're running from inside Eclipse, see 
http://wiki.apache.org/nutch/RunNutchInEclipse0.9 for more details.
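If you're running from the command line instead, one way to rule out a
memory problem is to raise the heap. This sketch assumes the stock
Nutch 0.9 bin/nutch launcher, which reads NUTCH_HEAPSIZE (in MB); the
1000 MB value is an arbitrary illustration. In Eclipse the equivalent
is adding a -Xmx VM argument to the run configuration:

```shell
# Hedged sketch: raise the JVM heap for command-line Nutch runs.
# NUTCH_HEAPSIZE (in MB) is read by the stock bin/nutch script;
# 1000 MB is an illustrative value, not a recommendation.
export NUTCH_HEAPSIZE=1000
echo "NUTCH_HEAPSIZE=$NUTCH_HEAPSIZE"
# → NUTCH_HEAPSIZE=1000
```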

-- Ken
-- 
Ken Krugler
+1 530-210-6378