nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Trivial Update of "NutchHadoopTutorial" by ilgiz
Date Wed, 18 Nov 2009 17:24:16 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "NutchHadoopTutorial" page has been changed by ilgiz.
http://wiki.apache.org/nutch/NutchHadoopTutorial?action=diff&rev1=15&rev2=16

--------------------------------------------------

  
  ----
  
- * By default Nutch will read only the first 100 links on a page.  This will result in incomplete indexes when scanning file trees.  So I set the "max outlinks per page" option to -1 in nutch-site.conf and got complete indexes.
+   * By default Nutch will read only the first 100 links on a page.  This will result in incomplete indexes when scanning file trees.  So I set the "max outlinks per page" option to -1 in nutch-site.conf and got complete indexes.
  {{{
  <property>
    <name>db.max.outlinks.per.page</name>
