nutch-dev mailing list archives

From Doug Cutting
Subject tools cleanup
Date Wed, 30 Mar 2005 20:53:24 GMT
I propose we clean up Nutch's tools as follows.

First, some definitions:

1. An "action" is an operation on Nutch data.  For example,
GenerateSegmentFromDB, FetchSegment, UpdateDB, IndexSegment,
MergeIndexes, and SearchServer are all actions.

2. A "tool" invokes an action from the command line.

The proposal:

1. Actions and tools should be separate classes, in separate files.

2. A tool class should define no methods other than a main() and perhaps 
those required to parse the command line.  All application logic should 
be in the action class.

3. All actions must implement the following interface:

   public interface NutchConfigurable {
     void setConf(NutchConf conf);
     NutchConf getConf();
   }

4. Most actions should implement this by extending:

   public class NutchConfigured implements NutchConfigurable {
     private NutchConf conf;
     public NutchConfigured(NutchConf conf) { setConf(conf); }
     public void setConf(NutchConf conf) { this.conf = conf; }
     public NutchConf getConf() { return conf; }
   }

5. All plugins must implement NutchConfigurable.

6. Plugin factory methods must accept a NutchConf.

For example:

   public static Protocol ProtocolFactory.getProtocol(String url);

will become:

   public static Protocol ProtocolFactory.getProtocol(NutchConf, String);
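
To make the six points concrete, here is a minimal single-file sketch.
It is not actual Nutch code. NutchConf and Protocol below are
simplified stand-ins, and CountUrlsAction, CountUrlsTool,
SimpleProtocol, getScheme() and the "sketch.scheme" property are
hypothetical names invented for illustration. The types are
package-private only so that everything fits in one file; under
point 1 each would live in its own file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for NutchConf: just a string-to-string property map.
class NutchConf {
  private final Map<String, String> props = new HashMap<>();
  void set(String name, String value) { props.put(name, value); }
  String get(String name, String defaultValue) {
    return props.getOrDefault(name, defaultValue);
  }
}

// Points 3 and 4: the interface and base class from the proposal.
interface NutchConfigurable {
  void setConf(NutchConf conf);
  NutchConf getConf();
}

class NutchConfigured implements NutchConfigurable {
  private NutchConf conf;
  NutchConfigured(NutchConf conf) { setConf(conf); }
  public void setConf(NutchConf conf) { this.conf = conf; }
  public NutchConf getConf() { return conf; }
}

// Point 5: a plugin type is NutchConfigurable (simplified stand-in interface).
interface Protocol extends NutchConfigurable {
  String getScheme();
}

// Hypothetical plugin implementation that only remembers its URL scheme.
class SimpleProtocol extends NutchConfigured implements Protocol {
  private final String scheme;
  SimpleProtocol(NutchConf conf, String scheme) {
    super(conf);
    this.scheme = scheme;
  }
  public String getScheme() { return scheme; }
}

// Point 6: the factory method takes the configuration explicitly.
class ProtocolFactory {
  static Protocol getProtocol(NutchConf conf, String url) {
    int colon = url.indexOf(':');
    if (colon <= 0) {
      throw new IllegalArgumentException("No scheme in URL: " + url);
    }
    return new SimpleProtocol(conf, url.substring(0, colon));
  }
}

// Points 1 and 2: a hypothetical action holding all application logic.
// It counts the URLs whose scheme matches the configured one.
class CountUrlsAction extends NutchConfigured {
  CountUrlsAction(NutchConf conf) { super(conf); }

  int run(String[] urls) {
    String wanted = getConf().get("sketch.scheme", "http");
    int count = 0;
    for (String url : urls) {
      Protocol protocol = ProtocolFactory.getProtocol(getConf(), url);
      if (wanted.equals(protocol.getScheme())) {
        count++;
      }
    }
    return count;
  }
}

// Points 1 and 2: the matching tool has nothing but main() and
// command-line handling; it delegates everything else to the action.
class CountUrlsTool {
  public static void main(String[] args) {
    if (args.length == 0) {
      System.err.println("Usage: CountUrlsTool <url> [<url> ...]");
      return;
    }
    NutchConf conf = new NutchConf();
    System.out.println(new CountUrlsAction(conf).run(args));
  }
}
```

Because the action receives its NutchConf through the constructor and
passes it on to the factory, nothing in the sketch depends on global
configuration state, so the action can be driven from code other than
its command-line tool.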


