nutch-dev mailing list archives

From "Josh Pavel (JIRA)" <j...@apache.org>
Subject [jira] Created: (NUTCH-966) Behavior of NOINDEX,FOLLOW is not intuitive
Date Wed, 09 Feb 2011 14:32:57 GMT
Behavior of NOINDEX,FOLLOW is not intuitive
-------------------------------------------

                 Key: NUTCH-966
                 URL: https://issues.apache.org/jira/browse/NUTCH-966
             Project: Nutch
          Issue Type: Improvement
          Components: indexer, parser
    Affects Versions: 1.2
            Reporter: Josh Pavel
            Priority: Minor


If a page sets NOINDEX,FOLLOW in its ROBOTS meta tag, Nutch still creates a document that
can be found in the index via meta tag or URL matching.  Instead, Nutch should rely on the doc
or parse metadata, and the HTML parser should store nothing for such a page. (Thanks to
Julien Nioche for helping me understand the issue.)
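
To illustrate the requested behavior, here is a minimal standalone sketch. It is not Nutch code and uses no Nutch APIs; the names RobotsDirectives, parse_robots_meta and plan_page are hypothetical and exist only for this example. It assumes the directive is read from the content attribute of the robots meta tag, that NOINDEX means no index entry at all is produced for the page, and that FOLLOW means the page's outlinks are still handed on for crawling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobotsDirectives:
    """Hypothetical holder for the two decisions a crawler has to make."""
    index: bool = True   # may the page itself be added to the index?
    follow: bool = True  # may the page's outlinks be queued for crawling?


def parse_robots_meta(content: str) -> RobotsDirectives:
    """Parse the content attribute of a robots meta tag, e.g. "NOINDEX,FOLLOW".

    Tokens are comma-separated and case-insensitive. "none" is treated as
    shorthand for "noindex,nofollow". Unknown tokens are ignored.
    """
    tokens = {t.strip().lower() for t in content.split(",") if t.strip()}
    index = not ({"noindex", "none"} & tokens)
    follow = not ({"nofollow", "none"} & tokens)
    return RobotsDirectives(index=index, follow=follow)


def plan_page(url, outlinks, robots_content):
    """Decide what to emit for one fetched page (illustrative only).

    For NOINDEX pages no document is produced at all, so nothing about the
    page (URL, meta tags, text) can match a query. Outlinks are kept or
    dropped independently, based on FOLLOW / NOFOLLOW.
    """
    directives = parse_robots_meta(robots_content)
    return {
        "index_document": {"url": url} if directives.index else None,
        "outlinks": list(outlinks) if directives.follow else [],
    }
```

With this sketch, plan_page("http://example.com/a", ["http://example.com/b"], "NOINDEX,FOLLOW") yields no index document but keeps the single outlink, which is the outcome this issue asks for.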

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
