nutch-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] Created: (NUTCH-140) Add alias capability in parse-plugins.xml file that allows mimeType->extensionId mapping
Date Wed, 14 Dec 2005 04:10:46 GMT
Add alias capability in parse-plugins.xml file that allows mimeType->extensionId mapping
----------------------------------------------------------------------------------------

         Key: NUTCH-140
         URL: http://issues.apache.org/jira/browse/NUTCH-140
     Project: Nutch
        Type: Improvement
  Components: fetcher  
 Environment:  Power Mac OS X 10.4, Dual Processor G5 2.0 GHz, 1.5 GB RAM, although the issue is
independent of the environment
    Reporter: Chris A. Mattmann
 Assigned to: Chris A. Mattmann 
    Priority: Minor


 Jerome and I have been discussing how to address the issue Stefan G. raised about the
parse-plugins.xml file mapping mimeType->list of pluginIds rather than mimeType->list
of extensionIds. We've come up with the following proposed update, which should fix
this problem.

  We propose to add the concept of "aliases" to the parse-plugins.xml file, defined at the
end of the file, something like:

 <parse-plugins>
    ....

   <mimeType name="text/html">
      <plugin id="parse-html"/>
   </mimeType>

    .....
  
   <aliases>
      <alias name="parse-html" extension-point="org.apache.nutch.parse.html.HtmlParser"/>

      ....
      <alias name="parse-html2" extension-point="my.other.html.Parser"/>

      ....
   </aliases>
</parse-plugins>



What do you guys think? This approach would be flexible enough to allow the mapping of extensionIds
to mimeTypes, but without impacting the current "pluginId" concept.
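
To make the intended lookup concrete, here is a minimal sketch in Python (for illustration only; it is not Nutch code). It assumes a fully spelled-out version of the file above with the elided sections removed. The helper names load_mappings and resolve_extensions are hypothetical, and so is the fallback of returning the plugin id unchanged when no alias is defined for it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, fully spelled-out version of the proposed file
# (the "...." sections from the proposal are simply left out here).
PARSE_PLUGINS_XML = """
<parse-plugins>
  <mimeType name="text/html">
    <plugin id="parse-html"/>
  </mimeType>
  <aliases>
    <alias name="parse-html" extension-point="org.apache.nutch.parse.html.HtmlParser"/>
    <alias name="parse-html2" extension-point="my.other.html.Parser"/>
  </aliases>
</parse-plugins>
"""


def load_mappings(xml_text):
    """Return (mimeType -> ordered list of plugin ids, alias name -> extension id)."""
    root = ET.fromstring(xml_text)
    mime_to_plugins = {
        mime.get("name"): [plugin.get("id") for plugin in mime.findall("plugin")]
        for mime in root.findall("mimeType")
    }
    aliases = {
        alias.get("name"): alias.get("extension-point")
        for alias in root.findall("./aliases/alias")
    }
    return mime_to_plugins, aliases


def resolve_extensions(mime_type, mime_to_plugins, aliases):
    """Map a mimeType to extension ids by way of the plugin ids and the alias table.

    Assumption: a plugin id without an alias is returned unchanged.
    """
    plugin_ids = mime_to_plugins.get(mime_type, [])
    return [aliases.get(plugin_id, plugin_id) for plugin_id in plugin_ids]


if __name__ == "__main__":
    mime_to_plugins, aliases = load_mappings(PARSE_PLUGINS_XML)
    print(resolve_extensions("text/html", mime_to_plugins, aliases))
```

The mimeType entries still refer only to plugin ids, so the existing "pluginId" concept stays as it is; the alias table is consulted as a second step to reach the extension.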

Comments welcome. 

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

