Hi Ronny,

The amount of work needed for a recrawl is highly connector-dependent. Unfortunately, the file system connector is one of the worst, because every directory entry has to be rescanned. A minimal crawl will cut down on this a lot but won't pick up deletions, so any schedule you come up with should include a full crawl once in a while.
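
To make the trade-off concrete, here is a toy model in Python. It is only a sketch, not ManifoldCF code: the `index` and `filesystem` dictionaries, the function names, and the `changed_paths` argument are all made up for illustration, and how a real connector learns which documents changed is connector-specific. It just shows why a full pass touches every entry while a minimal pass leaves deleted documents behind in the index.

```python
def full_crawl(index, filesystem):
    """Rescan every entry: picks up additions, changes, and deletions.

    index and filesystem are hypothetical dicts mapping path -> version.
    Returns the number of entries that had to be checked.
    """
    checked = 0
    for path, version in filesystem.items():
        checked += 1
        if index.get(path) != version:
            index[path] = version  # new or modified document
    # Anything still indexed but no longer on disk is a deletion.
    for path in list(index):
        if path not in filesystem:
            del index[path]
    return checked


def minimal_crawl(index, filesystem, changed_paths):
    """Process only paths already known to be new or changed.

    changed_paths is an assumption of this sketch. Stale entries for
    deleted documents are never removed here.
    """
    checked = 0
    for path in changed_paths:
        if path in filesystem:
            checked += 1
            index[path] = filesystem[path]
    return checked
```

With three indexed documents, one modified and one deleted, the minimal pass checks a single entry but keeps the deleted document in the index; the next full pass checks every remaining entry and cleans it up.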


From: Ronny Heylen
Sent: 11/7/2013 4:34 PM
To: user@manifoldcf.apache.org
Subject: Manifoldcf is "slow"

A job indexes all *.doc* files from a shared Windows network drive.
That makes 245113 documents.
The job ran last night.
It ran again tonight and finished successfully in 2 hours and 20 minutes.
But of these 245113 documents, only 30 were modified today.
How is it possible that 140 minutes were necessary to reindex them?
Is it because ManifoldCF rechecks the permissions for all documents from the AD?
Or something else?
Can we speed things up by using "Start minimal"? (What exactly "Start minimal" means is a little mysterious to us.) In that case, should we use "Start" once a week to stay up to date?
Thanks for the help,