nutch-dev mailing list archives

From Sami Siren
Subject Re: [jira] Commented: (NUTCH-293) support for Crawl-delay in Robots.txt
Date Wed, 19 Jul 2006 21:22:47 GMT
Andrzej Bialecki (JIRA) wrote:

>Andrzej Bialecki commented on NUTCH-293:
>I'm working on this patch to commit it. Just a quick note to Sami: Math.max() is not optimal,
>because it always picks up the longest wait period. We are interested in getting a right period
>- it may be longer, but it may also be shorter than the serverDelay. If it's shorter then
>we win, because we are allowed to crawl this site faster.
I guess it depends on the angle you look at it :)
"don't be polite, just as polite as required"

I'm ok with the original logic.
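
For readers following the thread, the difference between the two approaches can be sketched roughly as below. This is only an illustration, not the code from the NUTCH-293 patch: the class name, both method names, and the convention that a negative value means "robots.txt gave no Crawl-delay" are assumptions made for the example; only serverDelay and Math.max() come from the discussion.

```java
// Illustrative sketch only; not the actual NUTCH-293 patch.
final class CrawlDelayChoice {

    // Hypothetical helper: the Math.max() variant always waits for the
    // longer of the configured serverDelay and the robots.txt Crawl-delay.
    public static long maxDelay(long serverDelayMs, long crawlDelayMs) {
        return Math.max(serverDelayMs, crawlDelayMs);
    }

    // Hypothetical helper: use the Crawl-delay whenever robots.txt gives one,
    // even if it is shorter than serverDelay. A negative crawlDelayMs is
    // assumed here to mean "no Crawl-delay specified".
    public static long preferRobotsDelay(long serverDelayMs, long crawlDelayMs) {
        return crawlDelayMs >= 0 ? crawlDelayMs : serverDelayMs;
    }

    public static void main(String[] args) {
        long serverDelayMs = 5000; // configured default delay
        long crawlDelayMs = 1000;  // value the site asks for in robots.txt

        // Math.max() keeps the longer wait, so the site is crawled more
        // slowly than it actually requires.
        System.out.println("max:    " + maxDelay(serverDelayMs, crawlDelayMs));

        // Preferring the robots.txt value allows the faster crawl.
        System.out.println("robots: " + preferRobotsDelay(serverDelayMs, crawlDelayMs));
    }
}
```

When the Crawl-delay is longer than serverDelay, both variants give the same result; they differ only when the site permits a shorter delay than the configured default.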

 Sami Siren
