nutch-dev mailing list archives

From "Robert Meusel (JIRA)" <j...@apache.org>
Subject [jira] [Created] (NUTCH-2202) Integration of Anthelion (Focused Crawling Module) into Nutch
Date Tue, 19 Jan 2016 09:08:39 GMT
Robert Meusel created NUTCH-2202:
------------------------------------

             Summary: Integration of Anthelion (Focused Crawling Module) into Nutch
                 Key: NUTCH-2202
                 URL: https://issues.apache.org/jira/browse/NUTCH-2202
             Project: Nutch
          Issue Type: Improvement
          Components: parser, scoring
            Reporter: Robert Meusel


We have recently released Anthelion (https://github.com/yahoo/anthelion), a focused crawler
plugin for structured data that can be extracted with Any23. As proposed by Lewis
(Lewis John McGibbney), we think integrating the parser (Any23) and the scoring function
based on the online learner could be a good improvement for Nutch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
