nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Update of "Nutch2Architecture" by DennisKubes
Date Thu, 24 Apr 2008 19:42:29 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by DennisKubes:
http://wiki.apache.org/nutch/Nutch2Architecture

The comment on the change is:
Changed DI for configuration and reflection utils

------------------------------------------------------------------------------
  == Overview ==
    * Reuse of existing Nutch codebase
      * While some things will change, this architecture is more of a refactor than a complete rewrite.  Much of the existing codebase, including plugin functionality, should be reused.
-   * Dependency Injection
-     * Remove the plugin framework and use a DI framework, Spring for example, to create mapper and reducer classes that are auto-injected with dependencies.  This will take modifications to the Hadoop codebase.
+   * Remove the plugin framework
+     * After some experimenting, DI using Spring or a similar framework presents problems.  The good news is that we can achieve the same thing by using the Configuration objects from Hadoop and creating new instances with ReflectionUtils.  This is more of a service locator than dependency injection, but it still gives us the same benefits.
+     * Have the ability to change the job configuration settings for tools.  This can be accomplished through some type of properties file on the classpath and would be useful for testing, for example to switch out an OutputFormat to see the output in text format.
      * Have mock objects that make it easy to test jobs.
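
The service locator idea above, together with the properties-based override for testing, might be sketched roughly as follows. This is a minimal illustration in plain Java, not Nutch or Hadoop code: `java.util.Properties` stands in for Hadoop's Configuration, the `locate` method stands in for the combination of `Configuration.getClass` and `ReflectionUtils.newInstance`, and the names `RecordWriter`, `CompactWriter`, `TextWriter`, `ServiceLocator`, and the key `job.writer.class` are all hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical pluggable component, standing in for something like an output format.
interface RecordWriter {
    String write(String key, String value);
}

// Hypothetical default implementation: a compact, length-prefixed encoding.
class CompactWriter implements RecordWriter {
    public String write(String key, String value) {
        return key.length() + ":" + key + value.length() + ":" + value;
    }
}

// Hypothetical human-readable implementation, handy when inspecting output in tests.
class TextWriter implements RecordWriter {
    public String write(String key, String value) {
        return key + "\t" + value;
    }
}

// Service locator: looks up a class name in the configuration and creates an
// instance by reflection. Properties is an assumed stand-in for a Hadoop Configuration.
final class ServiceLocator {
    private final Properties conf;

    ServiceLocator(Properties conf) {
        this.conf = conf;
    }

    <T> T locate(String key, Class<? extends T> defaultImpl, Class<T> iface) {
        String className = conf.getProperty(key);
        try {
            // Fall back to the default implementation when the key is not configured.
            Class<?> raw = (className == null) ? defaultImpl : Class.forName(className);
            return iface.cast(raw.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create implementation for " + key, e);
        }
    }
}

class ServiceLocatorDemo {
    public static void main(String[] args) throws IOException {
        // Production-style configuration: nothing set, so the default is used.
        Properties conf = new Properties();
        RecordWriter writer =
            new ServiceLocator(conf).locate("job.writer.class", CompactWriter.class, RecordWriter.class);
        System.out.println(writer.write("url", "ok"));

        // Test-style override, as it might be read from a properties file on the classpath.
        Properties testConf = new Properties();
        testConf.load(new StringReader("job.writer.class=" + TextWriter.class.getName()));
        RecordWriter textWriter =
            new ServiceLocator(testConf).locate("job.writer.class", CompactWriter.class, RecordWriter.class);
        System.out.println(textWriter.write("url", "ok"));
    }
}
```

The calling code asks the locator for an implementation rather than having one injected, which is why this is closer to a service locator than to dependency injection. Swapping the implementation for a test only requires changing a configuration value.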
  
  == Data Structures ==
