nutch-dev mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: Mailing List nutch-agent Reports of Bots Submitting Forms
Date Wed, 24 May 2006 20:13:08 GMT
Jeremy Bensley wrote:
> There are posts every three or four days to the nutch-agent regarding bots
> submitting empty forms to websites. I don't think I've seen any regular devs
> reply in-list to these issues, and am just wondering if these cases are
> being analyzed.
>
> 1. Is there a known (resolved or current) bug regarding Nutch submitting
> forms? I could find no bug listings in JIRA for this.  If it is known and
> resolved, what versions of the bot exhibit this behavior?

Yes, there was a discussion on the list about this - I'm afraid this 
behavior is present in both 0.7.x and 0.8. I'm going to remove the 
offending code (or to put it as an option, turned off by default).

>
> 2. Are the Nutch Devs replying to the emails sent to this list? I could
> understand if they are replying off-list, but to an outside observer such as
> myself it appears as though webmasters are not getting many replies to their
> inquiries.

I can speak for myself only .. I'm not tracking that list. What about 
others?

>
> I don't mean to be alarmist, but I think it is in the community's best
> interests to make sure that these kinds of complaints get resolved such that
> nutch is a good 'citizen' and isn't blacklisted from searching sites.

Of course you are right, there is no ill will here on our part, just a long 
queue of issues to address ... but it seems we have to prioritize this one.

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


