manifoldcf-user mailing list archives

From Karl Wright
Subject RE: Timeout problems with web crawling
Date Tue, 23 Apr 2013 12:18:00 GMT
Do you have the ability to use Wireshark or tcpdump on this machine? If
so, can you set up a crawl with only that URL, and compare the crawler's
fetch against the curl request? There must be some key difference.
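
One way to do that comparison, once both requests have been captured as plain text (for example from a tcpdump or Wireshark session), might be a small script along the lines below. This is only a sketch: the function names `parse_request` and `diff_requests` are made up for illustration, and it assumes each capture has been saved as a raw HTTP request (request line, headers, blank line).

```python
def parse_request(raw):
    """Split a raw HTTP request into (request_line, headers).

    Header names are lower-cased so the comparison is case-insensitive.
    Assumes one header per line (no folded/continuation lines).
    """
    lines = raw.replace("\r\n", "\n").split("\n")
    request_line = lines[0].strip()
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers


def diff_requests(raw_a, raw_b):
    """Return a list of (field, value_in_a, value_in_b) for every difference.

    A value of None means the header is absent from that request.
    """
    line_a, headers_a = parse_request(raw_a)
    line_b, headers_b = parse_request(raw_b)
    diffs = []
    if line_a != line_b:
        diffs.append(("request-line", line_a, line_b))
    for name in sorted(set(headers_a) | set(headers_b)):
        if headers_a.get(name) != headers_b.get(name):
            diffs.append((name, headers_a.get(name), headers_b.get(name)))
    return diffs
```

Calling `diff_requests(crawler_capture, curl_capture)` with the two captured request texts lists only the fields that differ, which narrows down where to look (a missing header, a different HTTP version, and so on). It compares request headers only; differences in timing or at the TCP level would still need to be read from the capture itself.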


From: Erlend Garåsen
Sent: 4/23/2013 8:03 AM
Subject: Re: Timeout problems with web crawling
On 23.04.13 13.48, Erlend Garåsen wrote:

> -bash-3.2$ curl -vvv -H "User-Agent: Mozilla/5.0
> (ApacheManifoldCFWebCrawler;"
> "|1366644879398+299979"

A small typo in the URL, so the correct command is:
curl -vvv -H "User-Agent: Mozilla/5.0 (ApacheManifoldCFWebCrawler;"

But same result. An immediate response.


Erlend Garåsen
Center for Information Technology Services
University of Oslo
P.O. Box 1086 Blindern, N-0317 OSLO, Norway
Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050
