manifoldcf-user mailing list archives

From jim switzer <mojosw...@gmail.com>
Subject Web Repository Deletion Policy
Date Thu, 29 Aug 2013 15:41:51 GMT
How does the web repository connector decide when to delete documents?

I ran a job yesterday, and crawled/processed ~250 documents.  This
morning, I did a 'start minimal' on the job, and it proceeded to
delete all the documents it crawled yesterday.  The site appears to
have been experiencing issues when I restarted the job, but I was
surprised to see so much content deleted after one failed job run.

Here is the 'Simple History' from the job:

<lots more document deletion messages>
08-29-2013 08:15:18.407 document deletion (LocalFiles) http://beta.blah.com:42541/hr/Pages/Reward-Recog...nition.aspx OK 0 1
08-29-2013 08:15:18.406 document deletion (LocalFiles) http://beta.blah.com:42541/IT/Documents/it_polic...yprocedure_internet_intranet_policy_100112.pdf OK 0 1
08-29-2013 08:15:16.268 fetch http://beta.blah.com:42541/_layouts/ 403 0 44
08-29-2013 08:15:11.267 fetch http://beta.blah.com:42541/ 302 178 149
08-29-2013 08:15:11.001 document deletion (LocalFiles) http://beta.blah.com:42541/Pages/Home.aspx OK 0 1
08-29-2013 08:12:28.911 fetch http://beta.blah.com:42541/Pages/Home.aspx 200 3376 162081
08-29-2013 08:12:27.365 job start 1377726689367(BetaCrawl) 0 1
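
To make the pattern in the history easier to see, here is a minimal Python sketch that tallies records by activity and result code. It is not part of ManifoldCF. The regular expression and the column layout are assumptions inferred from the lines pasted above, with each record joined onto one line, and the two trailing numeric columns are captured without assigning them a meaning.

```python
import re
from collections import Counter

# Records copied from the Simple History above, one record per line.
HISTORY = """\
08-29-2013 08:15:18.407 document deletion (LocalFiles) http://beta.blah.com:42541/hr/Pages/Reward-Recog...nition.aspx OK 0 1
08-29-2013 08:15:16.268 fetch http://beta.blah.com:42541/_layouts/ 403 0 44
08-29-2013 08:15:11.267 fetch http://beta.blah.com:42541/ 302 178 149
08-29-2013 08:15:11.001 document deletion (LocalFiles) http://beta.blah.com:42541/Pages/Home.aspx OK 0 1
08-29-2013 08:12:28.911 fetch http://beta.blah.com:42541/Pages/Home.aspx 200 3376 162081
"""

# Assumed layout: date, time, activity, URL, result code, two numeric columns.
RECORD = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<activity>.+?) (?P<url>https?://\S+) "
    r"(?P<result>\S+) (?P<n1>\d+) (?P<n2>\d+)$"
)


def summarize(history):
    """Count records per (activity, result code); skip lines that do not match."""
    counts = Counter()
    for line in history.splitlines():
        match = RECORD.match(line.strip())
        if match is None:
            continue  # e.g. "job start" lines, which carry no URL
        counts[(match["activity"], match["result"])] += 1
    return counts


summary = summarize(HISTORY)
for (activity, result), count in sorted(summary.items()):
    print(f"{activity:32} {result:>4} {count}")
```

On the excerpt above this separates the single 200 fetch from the 302 and 403 fetches and from the deletions, which is the split I am asking about.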
