manifoldcf-user mailing list archives

From Erlend Garåsen <e.f.gara...@usit.uio.no>
Subject Re: Hop count problem
Date Mon, 12 Aug 2013 14:53:39 GMT
On 8/12/13 4:29 PM, Karl Wright wrote:
> Hi Erlend,
>
> The Document Status report shows these documents because they are still
> in the queue.  The reasons for this could be several.  Documents that
> exceed the hopcount by 1 level are allowed to remain in the queue for
> bookkeeping purposes.  "scheduled date" as given is only meaningful if the
> document is in an active state; my guess is that these documents are not
> in fact in that state, but rather in the state HOPCOUNT_EXCEEDED.  Can
> you include one complete row from the Document Status report for one of
> the missing documents?

For "http://www.ibsen.uio.no/sakprosa.xhtml":
Job: Ibsen
State: Out of scope
Status: Hopcount exceeded
Scheduled: 01-01-1970 01:00:00.000
Scheduled action: Process
Retry count: N/A
Retry limit: N/A
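
As an illustration of the rule quoted above (documents one level past the hop count limit stay in the queue, marked as exceeded), here is a minimal Python sketch. It is a toy model and not ManifoldCF's actual code: the function name, the state labels, the link graph and the example.org URLs are all hypothetical.

```python
from collections import deque


def classify_by_hopcount(links, seeds, max_hops):
    """Toy model: breadth-first walk from the seeds, recording minimum hops per URL.

    URLs within max_hops are "active". URLs exactly one hop past the limit are
    still recorded (for bookkeeping) but their links are never followed, so
    anything further away is never discovered at all.
    """
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        if hops[url] > max_hops:
            # One level past the limit: keep the record, do not expand it.
            continue
        for target in links.get(url, []):
            if target not in hops:
                hops[target] = hops[url] + 1
                queue.append(target)
    return {
        url: "active" if n <= max_hops else "hopcount exceeded"
        for url, n in hops.items()
    }


# Hypothetical link graph: / -> /a -> /b -> /c
example_links = {
    "http://example.org/": ["http://example.org/a"],
    "http://example.org/a": ["http://example.org/b"],
    "http://example.org/b": ["http://example.org/c"],
}
example_states = classify_by_hopcount(
    example_links, ["http://example.org/"], max_hops=1
)
for url, state in example_states.items():
    print(url, state)
```

In this model a URL that is itself a seed always has hop count 0, so it would not be expected to end up in the exceeded state unless it is not actually being treated as a seed.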

> When you added documents to the seed list, what did the Simple History
> say when they were fetched?  If they don't appear in the simple history,
> they SHOULD have nevertheless appeared in the log, with an explanation
> of why they were excluded, provided you have connector debugging enabled.

OK, here is the seed list:
http://www.ibsen.uio.no/
http://www.ibsen.uio.no/skuespill.xhtml
http://www.ibsen.uio.no/dikt.xhtml
http://www.ibsen.uio.no/brev.xhtml
http://www.ibsen.uio.no/sakprosa.xhtml
http://www.ibsen.uio.no/varia.xhtml
http://www.ibsen.uio.no/undervisningsressurser.xhtml

Here are the results from the Simple History:
08-12-2013 16:46:26.536	job end	1368534065016(Ibsen)		0	1
08-12-2013 16:46:09.927	document ingest (Solr)	http://www.ibsen.uio.no/forside.xhtml	OK	11897	178
08-12-2013 16:46:09.751	fetch	http://www.ibsen.uio.no/forside.xhtml	200	11897	17
08-12-2013 16:44:48.829	fetch	http://www.ibsen.uio.no/	302	0	79484
08-12-2013 16:44:48.727	robots parse	www.ibsen.uio.no:80	HTML	0	2	Robots file contained HTML, skipped
08-12-2013 16:44:46.574	job start	1368534065016(Ibsen)		0	1

HttpClient log:
http://folk.uio.no/erlendfg/manifoldcf/manifoldcf.log

Erlend

