manifoldcf-user mailing list archives

From Paththamestrige Perera <prasad.srimal.per...@gmail.com>
Subject Question about using ManifoldCF Repository Connectors
Date Tue, 15 Jul 2014 21:06:01 GMT
Hello All,

I'm new to Apache ManifoldCF and have spent some time reading the book
'ManifoldCF in Action' as well. I have started using ManifoldCF with the
available repository connectors: the CMIS Repository Connector, the
Alfresco Repository Connector and the File System Connector.

I have used them as continuous crawlers with specific re-crawl intervals.
What I have noticed is that, regardless of whether the document version
has changed, the CMIS and Alfresco connectors process all seeded documents
in every re-crawl job. I took a look at their implementations and, as far
as I could see, these repository connectors do not use the 'scanOnly'
property, which hints whether the document version has changed, when
processing seeded documents. This seems intentional by design, so I would
like to know why it is necessary to process all seeded documents (as
opposed to processing only the documents that were updated within the
re-crawl interval)?
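
For illustration, here is a minimal sketch of the behaviour being asked
about, namely skipping documents whose 'scanOnly' flag indicates an
unchanged version. The class name, method signature and document
identifiers are hypothetical stand-ins and are not the actual ManifoldCF
connector API; the sketch also assumes that a true flag means "version
unchanged".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a connector; NOT the ManifoldCF API.
public class ScanOnlySketch {

    // Returns the identifiers that would be fetched and re-ingested.
    // Assumption: scanOnly[i] == true means the version of document i
    // has not changed since the previous crawl.
    static List<String> processDocuments(String[] documentIdentifiers,
                                         boolean[] scanOnly) {
        List<String> ingested = new ArrayList<>();
        for (int i = 0; i < documentIdentifiers.length; i++) {
            if (scanOnly[i]) {
                // Unchanged version: skip the fetch and the re-ingestion.
                continue;
            }
            ingested.add(documentIdentifiers[i]);
        }
        return ingested;
    }

    public static void main(String[] args) {
        String[] ids = {"doc-1", "doc-2", "doc-3"};
        boolean[] scanOnly = {true, false, true};
        // Only doc-2 is flagged as changed, so only it is processed.
        System.out.println(processDocuments(ids, scanOnly));
    }
}
```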

Thanks!

Prasad Perera.
