nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Update of "RunNutchInEclipse0.9" by BartoszGadzimski
Date Tue, 14 Apr 2009 08:21:21 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by BartoszGadzimski:
http://wiki.apache.org/nutch/RunNutchInEclipse0%2e9

The comment on the change is:
Added java heap size solution

------------------------------------------------------------------------------
- = RunNutchInEclipse =
+ = Run Nutch in Eclipse on Linux and Windows (Nutch version 0.9) =
  
  This is a work in progress. If you find errors or would like to improve this page, just
create an account [UserPreferences] and start editing this page :-)
  
@@ -104, +104 @@

   * click on "Run"
   * if all works, you should see Nutch getting busy at crawling :-)
  
- == Debug Nutch in Eclipse (not yet tested for 0.9) ==
+ == Java Heap Size problem ==
+ 
+ If you find a line similar to this in hadoop.log:
+ 
+ {{{
+ 2009-04-13 13:41:06,105 WARN  mapred.LocalJobRunner - job_local_0001
+ java.lang.OutOfMemoryError: Java heap space
+ }}}
+ 
+ you should increase the amount of memory available to applications launched from Eclipse.
+ 
+ Just set it in:
+ 
+ Eclipse -> Window -> Preferences -> Java -> Installed JREs -> Edit -> Default VM arguments
+ 
+ I've set mine to 
+ {{{
+ -Xms5m -Xmx150m 
+ }}}
+ because I have about 200 MB of RAM left after starting all my other applications.
+ 
+ -Xms sets the initial (minimum) size of the Java heap.
+ -Xmx sets the maximum size of the Java heap.
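
One way to confirm that the VM arguments took effect is to print the heap limits from a small class launched inside Eclipse with the same JRE. The sketch below is only an illustration: the class name `HeapCheck` and the helper `toMiB` are hypothetical and not part of Nutch. It uses only the standard `java.lang.Runtime` methods.

```java
// Hypothetical helper class (not part of Nutch) for checking the heap
// limits of the JVM that Eclipse launches.
public class HeapCheck {

    // Convert a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper bound the JVM will try to use, governed by -Xmx.
        System.out.println("max heap (MiB):   " + toMiB(rt.maxMemory()));
        // Heap currently reserved by the JVM; starts near -Xms.
        System.out.println("total heap (MiB): " + toMiB(rt.totalMemory()));
        // Unused part of the currently reserved heap.
        System.out.println("free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

Run it via Run As -> Java Application. With `-Xmx150m` the reported maximum should be close to 150 MiB, although some JVMs report a slightly smaller figure than the value passed on the command line.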
+ 
+ 
+ == Debug Nutch in Eclipse  ==
   * Set breakpoints and debug a crawl
   * It can be tricky to find out where to set the breakpoint, because of the Hadoop jobs.
Here are a few good places to set breakpoints:
  {{{
