nutch-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] [Created] (NUTCH-1927) Create a whitelist of IPs/hostnames to allow skipping of RobotRules parsing
Date Thu, 29 Jan 2015 20:56:35 GMT
Chris A. Mattmann created NUTCH-1927:
----------------------------------------

             Summary: Create a whitelist of IPs/hostnames to allow skipping of RobotRules parsing
                 Key: NUTCH-1927
                 URL: https://issues.apache.org/jira/browse/NUTCH-1927
             Project: Nutch
          Issue Type: Bug
            Reporter: Chris A. Mattmann


Based on discussion on the dev list, and to support valid security research use cases for
Nutch (DDoS, DNS, and other testing), I am going to create a patch that adds a configurable whitelist:

{code:xml}
<property>
  <name>robot.rules.whitelist</name>
  <value>132.54.99.22,hostname.apache.org,foo.jpl.nasa.gov</value>
  <description>Comma-separated list of hostnames or IP addresses for which robot rules
parsing is skipped.
  </description>
</property>
{code}
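
For illustration only, here is a minimal sketch of how a value like the one above could be parsed and checked. This is not the patch itself: the class name RobotRulesWhitelist and the method isWhitelisted are hypothetical, and the exact-match, case-insensitive comparison is an assumption rather than settled behavior.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not part of Nutch): holds the parsed robot.rules.whitelist value. */
public class RobotRulesWhitelist {

  private final Set<String> entries;

  /** Parses a comma-separated property value; blank entries are ignored. */
  public RobotRulesWhitelist(String propertyValue) {
    Set<String> parsed = new LinkedHashSet<>();
    if (propertyValue != null) {
      for (String entry : propertyValue.split(",")) {
        // Assumption: entries are compared case-insensitively after trimming.
        String normalized = entry.trim().toLowerCase(Locale.ROOT);
        if (!normalized.isEmpty()) {
          parsed.add(normalized);
        }
      }
    }
    this.entries = Collections.unmodifiableSet(parsed);
  }

  /** Returns true if the given hostname or IP address exactly matches a whitelist entry. */
  public boolean isWhitelisted(String hostOrIp) {
    if (hostOrIp == null) {
      return false;
    }
    return entries.contains(hostOrIp.trim().toLowerCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    RobotRulesWhitelist whitelist =
        new RobotRulesWhitelist("132.54.99.22,hostname.apache.org,foo.jpl.nasa.gov");
    System.out.println(whitelist.isWhitelisted("foo.jpl.nasa.gov")); // listed host
    System.out.println(whitelist.isWhitelisted("example.org"));      // host not listed
  }
}
```

In this sketch the fetcher would call isWhitelisted with the host of each URL and skip robot rules parsing only when it returns true. Every host that is not listed keeps the normal robots.txt handling.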



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)