nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Trivial Update of "Nutch2Tutorial" by LewisJohnMcgibbney
Date Wed, 13 Jun 2012 16:06:05 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "Nutch2Tutorial" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/Nutch2Tutorial?action=diff&rev1=3&rev2=4

  This document describes how to get Nutch 2.0 to use HBase as a storage backend for Gora.
  
   * Grab a distribution of Nutch 2.X from [[http://www.apache.org/dyn/closer.cgi/nutch/|here]]
-  * Install and configure HBase. You can get it [[http://www.apache.org/dyn/closer.cgi/hbase/|here]]
('''N.B.''' Gora 0.2 uses HBase 0.90.4, however the setup is know to work with more recent
versions of HBase.)
+  * Install and configure HBase. You can get it [[http://www.apache.org/dyn/closer.cgi/hbase/|here]]
('''N.B.''' Gora 0.2 uses HBase 0.90.4, however the setup is known to work with more recent
versions of HBase.)
   * Specify the GORA backend in nutch-site.xml
  
  {{{
@@ -44, +44 @@

  
  You should find more details in the logs on ''$NUTCH_HOME/runtime/local/logs/hadoop.log''.
  
+ '''N.B.''' The process of using the other datastore implementations offered within Gora
e.g. Apache Cassandra, Accumulo and Sql, can be achieved simply by tweaking the above settings
prior to compiling the Nutch code.
+ 
  For more details of the command line interface options, please see [[http://wiki.apache.org/nutch/CommandLineOptions|here]],
or of course run ./bin/nutch which will print usage to std out.
  Finally, for a more detailed Nutch (1.X) tutorial, please see [[http://wiki.apache.org/nutch/NutchTutorial|here]]
  
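The "above settings" mentioned in the N.B. fall inside the part of the page that this diff elides. As a rough sketch only, switching datastores is usually a matter of changing the store class configured in nutch-site.xml. The property name and the Gora class names below do not appear in this message and are assumptions based on the usual Nutch 2.x and Gora layout, so check them against the full wiki page before relying on them.

{{{
<!-- Assumed example, not taken from this diff: selects the Gora datastore class. -->
<property>
  <name>storage.data.store.class</name>
  <!-- HBase backend, as described by the tutorial. -->
  <value>org.apache.gora.hbase.store.HBaseStore</value>
  <!-- For Cassandra, the value would presumably be
       org.apache.gora.cassandra.store.CassandraStore instead. -->
  <description>Default class for storing data</description>
</property>
}}}

Because the N.B. says the change must be made prior to compiling, the matching Gora module would presumably also need to be available as a build dependency before Nutch is rebuilt.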
