manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Crawling behind an ISA proxy (iis 7.5) revisited
Date Mon, 18 Jun 2012 21:38:29 GMT
Posted two patches for CONNECTORS-482 also, if you want enhanced debugging.

Karl

On Mon, Jun 18, 2012 at 5:24 PM, Karl Wright <daddywri@gmail.com> wrote:
> Please add the patch for CONNECTORS-483.  This adds the NT proxy
> feature, ported from the RSS connector.
>
> Karl
>
> On Mon, Jun 18, 2012 at 2:10 PM, Jan van Haarst <jan@vanhaarst.net> wrote:
>> OK, we'll do.
>>
>> On Mon, Jun 18, 2012 at 3:18 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>
>>> I'll be committing any changes to trunk.  I'm happy to also include a
>>> patch, which should work with 0.5-incubating, but you'll need to build
>>> it, of course, with the patch in place.
>>>
>>> Karl
>>>
>>> On Mon, Jun 18, 2012 at 9:09 AM, Jan van Haarst <jan@vanhaarst.net> wrote:
>>> > Hello Karl,
>>> >
>>> > The version we have running is ManifoldCF 0.5-incubating.
>>> > It would be great to be able to get to the bottom of this.
>>> >
>>> > Dag,
>>> > Jan
>>> >
>>> > On Mon, Jun 18, 2012 at 2:21 PM, Karl Wright <daddywri@gmail.com> wrote:
>>> >>
>>> >> HTTPClient 3.1 itself does not seem to provide a logging option for
>>> >> logging the body.  However, it should be straightforward to add this
>>> >> to the ManifoldCF code.  What version are you running, so that I can
>>> >> provide the appropriate patch?
>>> >>
>>> >> Karl
>>> >>
>>> >>
>>> >>
>>> >> On Mon, Jun 18, 2012 at 8:09 AM, Jan van Haarst <jan@vanhaarst.net> wrote:
>>> >> > Hello all,
>>> >> >
>>> >> > I'm a colleague of the original poster [1].
>>> >> >
>>> >> > We got a lot further in figuring out the flow of the website, and
>>> >> > thus the way ManifoldCF should crawl it.
>>> >> > In that process, we discovered that our problem might lie with
>>> >> > httpclient, as the server responds with a 401.2 response because the
>>> >> > client doesn't send authentication headers, as mentioned in [2].
>>> >> >
>>> >> > My question is this:
>>> >> > Is the raw response of the server stored somewhere in case of a 401
>>> >> > return code?
>>> >> > If so, I can check whether my idea is right, and after that try to
>>> >> > fix it.
>>> >> >
>>> >> > With kind regards,
>>> >> >
>>> >> > Jan van Haarst
>>> >> >
>>> >> > [1] http://mail-archives.apache.org/mod_mbox/incubator-connectors-user/201205.mbox/%3CCAFxWV0WY_Vojsshbfr0PSs%3DG-Xpd1wUJXFcbVVsOvntbXs1zRg%40mail.gmail.com%3E
>>> >> >
>>> >> >
>>> >> > [2] http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/8feeaa51-c634-4de3-bfdc-e922d195a45e.mspx?mfr=true
>>> >> >
>>> >> >
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Dag,
>>> > Jan
>>
>>
>>
>>
>> --
>> Dag,
>> Jan
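
The question at the bottom of the thread (whether the raw server response is available when a 401 comes back) can be illustrated with a small sketch. This is not ManifoldCF or HTTPClient 3.1 code and not the patch mentioned above; it is a Python standard-library example, and the handler class, the function names, and the response text are all hypothetical. It starts a local stand-in server that always answers 401, then shows a client keeping the status, headers, and body instead of discarding them.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class Always401Handler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the protected site: always answers 401."""

    def do_GET(self):
        # Made-up body text, loosely modelled on an IIS 401.2 error page.
        body = b"401.2 - Unauthorized: logon failed due to server configuration"
        self.send_response(401)
        self.send_header("WWW-Authenticate", "NTLM")
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging so the demo stays quiet


def fetch_with_401_capture(url):
    """Return (status, headers, body), keeping the body even on an HTTP error."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status, dict(resp.headers), resp.read()
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object, so the raw body is readable.
        return err.code, dict(err.headers), err.read()


def demo():
    """Run the stand-in server on a free local port and fetch from it once."""
    server = HTTPServer(("127.0.0.1", 0), Always401Handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        return fetch_with_401_capture(url)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, headers, body = demo()
    print(status, headers.get("WWW-Authenticate"), body.decode())
```

Inspecting the `WWW-Authenticate` header and the body of the 401 is one way to check which authentication scheme the server expects before changing the client.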
