Hi Prasad,

Since the CMIS and Alfresco connectors do not pay attention to the scanOnly flag, they are not correctly written and should be fixed.  Could you create a ticket to address this?
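
For reference, here is a minimal sketch of the pattern I have in mind. It does not use the real ManifoldCF interfaces: the class name ScanOnlySketch and the simplified processDocuments signature are hypothetical, and it assumes that scanOnly[i] is true when the version of document i has not changed since the last crawl.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanOnlySketch {
    // Hypothetical stand-in for a connector's document-processing loop.
    // Returns the identifiers that would actually be fetched and ingested.
    static List<String> processDocuments(String[] documentIdentifiers, boolean[] scanOnly) {
        List<String> ingested = new ArrayList<>();
        for (int i = 0; i < documentIdentifiers.length; i++) {
            if (scanOnly[i]) {
                // Assumption: the version is unchanged, so skip the fetch and
                // ingest. A real connector may still need to discover child
                // references here.
                continue;
            }
            ingested.add(documentIdentifiers[i]);
        }
        return ingested;
    }

    public static void main(String[] args) {
        String[] ids = {"doc-a", "doc-b", "doc-c"};
        boolean[] scanOnly = {true, false, true};
        // Only doc-b is marked as changed, so only doc-b is ingested.
        System.out.println(processDocuments(ids, scanOnly));
    }
}
```

The point is only that the check happens per document, before any expensive fetch from the repository.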


On Tue, Jul 15, 2014 at 5:06 PM, Paththamestrige Perera <prasad.srimal.perera@gmail.com> wrote:
Hello All,

I'm new to Apache ManifoldCF and have spent some time reading the book 'ManifoldCF in Action' as well. I have started using ManifoldCF with the available repository connectors: the CMIS Repository Connector, the Alfresco Repository Connector and the File System Connector.

I have used them as continuous crawlers with specific re-crawl intervals. What I have noticed is that, regardless of whether the document version has changed, the CMIS and Alfresco connectors process all seeded documents in every re-crawl job. I took a look at their implementations and, as far as I can see, these repository connectors do not use the 'scanOnly' property, which hints whether the document version has changed, when processing seeded documents. This seems to be intentional. So I would like to know why it is necessary to process all seeded documents (as opposed to processing only the documents that were updated within the re-crawl interval)?


Prasad Perera.