nutch-dev mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: Following <form action> tags
Date Fri, 19 May 2006 18:24:52 GMT
Doug Cutting wrote:
> Andrzej Bialecki wrote:
>> I read through your email exchange, and setting aside all emotional 
>> content I think this is a valid request - indeed, as far as I can 
>> tell other major crawlers don't follow these links. We could either 
>> remove this, or make it optional (default not to use them).
>
> Is this as simple as deleting line 60 from DOMContentUtils.java (in 
> the html-parser plugin)?

Yes.
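
For illustration only, here is a minimal sketch of the "make it optional" variant mentioned above. It is not the actual DOMContentUtils code: the class name, the method name and the useFormAction flag are all hypothetical, and the tag table is an assumed example. The idea is just that the form/action entry is only registered when explicitly enabled, so the default is not to follow those links.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; NOT the real DOMContentUtils from the html-parser plugin.
public class LinkTagConfig {

    // Builds a table of element name -> attribute that holds a link target.
    // useFormAction is an assumed option name; passing false mirrors the
    // proposed default of not following <form action> links.
    public static Map<String, String> linkAttributes(boolean useFormAction) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("a", "href");
        attrs.put("area", "href");
        attrs.put("frame", "src");
        attrs.put("iframe", "src");
        if (useFormAction) {
            // Only collect form targets when explicitly requested.
            attrs.put("form", "action");
        }
        return attrs;
    }

    public static void main(String[] args) {
        System.out.println(linkAttributes(false).containsKey("form"));
        System.out.println(linkAttributes(true).get("form"));
    }
}
```

Simply deleting the one line corresponds to never adding the "form" entry at all; the flag keeps the old behaviour available for anyone who relies on it.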

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


