nutch-dev mailing list archives

From Doğacan Güney (JIRA) <>
Subject [jira] Commented: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server
Date Wed, 03 Oct 2007 07:42:50 GMT


Doğacan Güney commented on NUTCH-559:

I haven't tested it yet, but after a quick review the latest patch looks good to me. However,
it would be nice to have some unit tests for the new functionality.

> Extending the authentication to work for more than one host was in my mind but I found
> too many possible cases. So I was planning to have a different configuration file where
> all the authentication rules can be mentioned to override the corresponding
> 'conf/nutch-site.xml' properties. The different possible cases are: [...]

OK, a different configuration file sounds good. (I don't like that we are putting a file in
conf/ for a plugin, but we already do that anyway.) We should probably prefix the name of the
file with the plugin's name to make it clear, e.g. httpclient-auth.txt.

> I removed cookie related code earlier because I didn't find it to work (even before merging
> my work). However, I have brought them back in the revised patch. We can discuss more on
> this if required.

I think the cookie code should work. It doesn't remember cookies across different crawl
cycles, but it should remember them during a single fetch.

> I have restored most of the original response reading code except for 'calculateTryToRead'.
> This method is not checking for 'Content-Length' limit. The content-length limit check
> present in this patch is similar to that of 'protocol-http' which is simpler and correct.
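
For reference, a content limit check of the kind described above could look roughly like the
sketch below. This is not the code from the patch or from 'protocol-http'. The class name
LimitedRead, the method readLimited and the parameter maxContent are hypothetical, and
treating a negative limit as "no truncation" is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not taken from the NUTCH-559 patch.
public class LimitedRead {

    // Reads the response body from 'in', stopping once 'maxContent' bytes
    // have been collected. Assumption: a negative 'maxContent' means no limit.
    static byte[] readLimited(InputStream in, int maxContent) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int total = 0;
        while (maxContent < 0 || total < maxContent) {
            // Never ask for more bytes than the limit still allows.
            int want = maxContent < 0
                    ? buf.length
                    : Math.min(buf.length, maxContent - total);
            int n = in.read(buf, 0, want);
            if (n == -1) {
                break; // end of stream reached before the limit
            }
            out.write(buf, 0, n);
            total += n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] body = new byte[10000];
        // Body longer than the limit: truncated to 6000 bytes.
        System.out.println(readLimited(new ByteArrayInputStream(body), 6000).length);
        // Negative limit: the whole 10000-byte body is read.
        System.out.println(readLimited(new ByteArrayInputStream(body), -1).length);
    }
}
```

The point of capping each read at the remaining allowance is that the result can never exceed
the limit, whatever the server claims in its Content-Length header.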


> NTLM, Basic and Digest Authentication schemes for web/proxy server
> ------------------------------------------------------------------
>                 Key: NUTCH-559
>                 URL:
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>    Affects Versions: 1.0.0
>            Reporter: Susam Pal
>         Attachments: NUTCH-559v0.1.patch, NUTCH-559v0.2.patch
> Added basic, digest and NTLM authentication schemes to protocol-httpclient. The authentication
> schemes can be configured for proxy server as well as web servers of a domain. HTTP
> authentication can take place over HTTP/1.0, HTTP/1.1 and HTTPS.
> The authentication guide can be found here: [].

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
