nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Update of "RunNutchInEclipse1.0" by FrankMcCown
Date Thu, 16 Apr 2009 19:32:57 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by FrankMcCown:
http://wiki.apache.org/nutch/RunNutchInEclipse1%2e0

The comment on the change is:
Moved heap problem to location of other problems

------------------------------------------------------------------------------
   * if all works, you should see Nutch getting busy at crawling :-)
  
  
- == Java Heap Size problem ==
- 
- If you find in hadoop.log line similar to this:
- 
- {{{
- 2009-04-13 13:41:06,105 WARN  mapred.LocalJobRunner - job_local_0001
- java.lang.OutOfMemoryError: Java heap space
- }}}
- 
- You should increase amount of RAM for running applications from eclipse.
- 
- Just set it in:
- 
- Eclipse -> Window -> Preferences -> Java -> Installed JREs -> edit ->
Default VM arguments
- 
- I've set mine to 
- {{{
- -Xms5m -Xmx150m 
- }}}
- because I have like 200MB RAM left after runnig all apps
- 
- -Xms (minimum ammount of RAM memory for running applications)
- -Xmx (maximum) 
- 
  == Debug Nutch in Eclipse (not yet tested for 0.9) ==
   * Set breakpoints and debug a crawl
   * It can be tricky to find out where to set the breakpoint, because of the Hadoop jobs.
Here are a few good places to set breakpoints:
@@ -195, +171 @@

  == If things do not work... ==
  Yes, Nutch and Eclipse can be a difficult companionship sometimes ;-)
  
+ === Java Heap Size problem ===
+ 
+ If the crawler throws an IOException early in the crawl (Exception in thread "main" java.io.IOException: Job failed!), check the logs/hadoop.log file for further information. If you find lines in hadoop.log similar to this:
+ 
+ {{{
+ 2009-04-13 13:41:06,105 WARN  mapred.LocalJobRunner - job_local_0001
+ java.lang.OutOfMemoryError: Java heap space
+ }}}
+ 
+ then you should increase the amount of RAM available to applications run from Eclipse.
+ 
+ Just set it in:
+ 
+ Eclipse -> Window -> Preferences -> Java -> Installed JREs -> edit -> Default VM arguments
+ 
+ I've set mine to 
+ {{{
+ -Xms5m -Xmx150m 
+ }}}
+ because I have about 200MB of RAM left after running all my other apps
+ 
+ -Xms (initial heap size for the JVM)
+ -Xmx (maximum heap size)
+ 
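To confirm that the Default VM arguments above actually took effect, a small check like the following can be run from Eclipse with the same JRE. This is only a minimal sketch: the class name HeapCheck and the helper toMiB are hypothetical and not part of Nutch. It relies solely on the standard java.lang.Runtime API.

```java
public class HeapCheck {
    /** Converts a byte count to whole mebibytes (hypothetical helper). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the most heap the JVM will try to use (governed by -Xmx).
        System.out.println("Max heap (MiB):   " + toMiB(rt.maxMemory()));
        // totalMemory() is the heap currently reserved (starts near -Xms).
        System.out.println("Total heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory() is the unused part of the currently reserved heap.
        System.out.println("Free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

If the reported maximum is still close to the old value, the arguments were probably set on a different JRE than the one the run configuration uses.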
- === eclipse: Cannot create project content in workspace ===
+ === Eclipse: Cannot create project content in workspace ===
  The Nutch source code must be outside the workspace folder. My first attempt was to download the code with Eclipse (svn) into my workspace. When I tried to create the project from existing code, Eclipse would not let me do it with the source code inside the workspace. I moved the source code out of my workspace and it worked fine.
  
  === plugin dir not found ===
