manifoldcf-user mailing list archives

From Shigeki Kobayashi <shigeki.kobayas...@g.softbank.co.jp>
Subject "start minimal" option even deletes contents whose links are deleted
Date Wed, 24 Dec 2014 02:43:07 GMT
Hello guys,

I would like to clarify how the “start minimal” option for job execution
works in MCF, using web content as the repository connection and Solr as the
output connection.

I thought it was supposed to skip deletion and crawl only changed or new
documents. However, a colleague on my team tested the minimal start option
and found that deletion was still performed.

To be more specific about how the crawled content changed: after a full
crawl, he deleted one link from the root page. The next minimal crawl should
therefore have left the content behind that link in the index instead of
deleting it.

He used an older version, MCF 1.4.1, but I cannot find anything about this
issue in the change list, so I suppose the same thing could happen in the
latest version.

How are we supposed to run MCF without index deletion?

Regards,
Shigeki
