manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject RE: Web crawl not completing
Date Sat, 09 Nov 2013 07:23:51 GMT
Hi Mark,

Transient failures of this kind are retried for a while, after which the
job is either aborted or the documents in question are skipped.  But it may
be many hours before this happens.  How long have you waited?
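
The retry-then-give-up behaviour described above can be sketched roughly as
follows. This is only an illustration of the general pattern, not
ManifoldCF's actual implementation: the function name, the JobAborted
exception, and the interval and window parameters are all hypothetical.

```python
import time


class JobAborted(Exception):
    """Hypothetical signal that the whole job should stop."""


def process_with_retries(fetch, retry_interval_s, give_up_after_s,
                         abort_on_give_up,
                         clock=time.monotonic, sleep=time.sleep):
    """Call fetch() until it succeeds or the retry window closes.

    Returns ("ok", result) on success, or ("skipped", None) when the
    window closes and abort_on_give_up is False. Raises JobAborted
    when the window closes and abort_on_give_up is True.
    """
    deadline = clock() + give_up_after_s
    while True:
        try:
            return ("ok", fetch())
        except TimeoutError as exc:
            # Transient failure: keep retrying until the window closes.
            if clock() >= deadline:
                if abort_on_give_up:
                    raise JobAborted(str(exc)) from exc
                return ("skipped", None)
            sleep(retry_interval_s)
```

With a long window and a long interval between attempts, a document that
never responds stays "active" for the whole window, which is consistent
with a job that appears stuck for hours before it finishes.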

Karl

Sent from my Windows Phone
------------------------------
From: Mark Libucha
Sent: 11/8/2013 3:46 PM
To: user@manifoldcf.apache.org
Subject: Web crawl not completing

My web crawl does not complete. The UI gets stuck at the end, showing that
4 documents are still Active.

In my logs, 4 URIs show warnings, like this:

 WARN 2013-11-08 15:09:04,439 (Worker thread '48') - Pre-ingest service
interruption reported for job 1383941193567 connection 'web': Timed out
waiting for response for 'http://myhost/somefile': The target server failed
to respond

Not always 4 files; sometimes more, sometimes fewer, and not always the
same files.
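
To see which URIs are affected on a given run, the warnings can be tallied
from the log. The following is a minimal sketch that assumes the message
text matches the warning quoted above; the function name and the log file
path in the usage note are hypothetical.

```python
import re
from collections import Counter

# Matches the "Timed out waiting for response for '<url>'" part of the
# warning, tolerating line wraps between the words.
TIMEOUT_RE = re.compile(
    r"Timed out\s+waiting\s+for\s+response\s+for\s+'([^']+)'"
)


def timed_out_urls(log_text):
    """Return a Counter mapping each timed-out URL to its warning count."""
    return Counter(TIMEOUT_RE.findall(log_text))
```

For example, `timed_out_urls(open("manifoldcf.log").read()).most_common()`
would list the affected URIs, most frequent first, so that runs can be
compared.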

My output connector never gets the job completed callback.

Any suggestions?

Thanks,

Mark
