manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Http status code 302
Date Wed, 09 Jan 2013 08:49:55 GMT
When I try the URL you gave using curl and no special arguments, I get this:


C:\Users\Karl>curl -vvv "http://lucene.jugem.jp/?eid=39"
* About to connect() to lucene.jugem.jp port 80 (#0)
*   Trying 210.172.160.170... connected
* Connected to lucene.jugem.jp (210.172.160.170) port 80 (#0)
> GET /?eid=39 HTTP/1.1
> User-Agent: curl/7.21.7 (i386-pc-win32) libcurl/7.21.7 OpenSSL/1.0.0c zlib/1.2.5 librtmp/2.3
> Host: lucene.jugem.jp
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Wed, 09 Jan 2013 08:47:52 GMT
< Server: Apache/2.0.59 (Unix)
< Vary: User-Agent,Host,Accept-Encoding
< Last-Modified: Tue, 08 Jan 2013 07:58:33 GMT
< Accept-Ranges: bytes
< Content-Length: 22594
< Cache-Control: private
< Pragma: no-cache
< Connection: close
< Content-Type: text/html

There's no 302 from here.

Are you trying to crawl through a proxy?  If so, that might be where
the problem lies.
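
To compare the direct and proxied cases, the same check can be scripted. The following is only a minimal sketch using the Python standard library, not anything taken from ManifoldCF: the function name fetch_status and its proxy parameter are hypothetical, it handles plain http:// URLs only, and it assumes a proxy that needs no authentication. It issues a single GET, does not follow redirects, and returns the status code together with any Location header.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, proxy=None, timeout=10):
    """Issue one GET without following redirects.

    `proxy` is an optional (host, port) tuple (hypothetical parameter).
    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("this sketch only handles plain http:// URLs")

    # Origin-form request target (path plus query) for direct connections.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    if proxy:
        # Through a proxy, connect to the proxy and send the absolute URL.
        proxy_host, proxy_port = proxy
        conn = http.client.HTTPConnection(proxy_host, proxy_port, timeout=timeout)
        target = url
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port or 80,
                                          timeout=timeout)
        target = path

    try:
        conn.request("GET", target,
                     headers={"Host": parts.netloc, "Accept": "*/*"})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling fetch_status("http://lucene.jugem.jp/?eid=39") once without a proxy and once with proxy=(host, port) set to your own proxy would show whether the 302 appears only on the proxied path.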

Karl

On Wed, Jan 9, 2013 at 3:40 AM, Karl Wright <daddywri@gmail.com> wrote:
> It sounds like the httpclient upgrade definitely broke something.  We
> should open a ticket.
>
> But first, can you confirm what connector this is?  Is it the web
> connector?  If so, I am puzzled because the web connector has always
> logged any 302 return, but then queued a second document which it
> subsequently fetches.
>
> Karl
>
> On Wed, Jan 9, 2013 at 2:10 AM, Shinichiro Abe
> <shinichiro.abe.1@gmail.com> wrote:
>> Hi,
>>
>> I'm using trunk code and crawling a web site with seeds that include http://lucene.jugem.jp/?eid=39 (Koji's blog; I don't obey robots.txt).
>> When I look at Simple History, it shows a 302 result code for the fetch activity, and the document is not ingested.
>>
>> When I used MCF 1.0.1 in the same situation, Simple History showed a 200 result code and MCF could ingest documents.
>>
>> Why does the trunk show a 302 status? Is it related to the httpclient upgrade?
>>
>> Thanks in advance,
>> Shinichiro Abe
