nutch-dev mailing list archives

From "Jorge Conejero Jarque" <jconej...@gpm.es>
Subject Crawler Data
Date Tue, 27 May 2008 11:46:54 GMT
I would like to build an application with the Nutch API that extracts data from pages
before they are added to the index, so that I can modify or process that data first.
I think this could be useful and interesting.

The problem is that I cannot find information on how to use the crawl part of the Nutch API.

I can only find tutorials that are run from the console with Cygwin, and they only cover
configuration and the creation of an index.

I would appreciate any help, ideally with some examples.
Thanks.

Regards,

Jorge Conejero Jarque
Dpto. Java Technology Group
GPM Factoría Internet
923100300
http://www.gpm.es 

 
