manifoldcf-user mailing list archives

From Marisol Redondo <marisol.redondo.gar...@gmail.com>
Subject Unreachable documents not deleted from solr
Date Thu, 14 Sep 2017 07:37:53 GMT
Hi.

I'm using ManifoldCF 2.x (2.5 on one VM and 2.6 on another) to crawl a
web site and index it into Solr 6.

I assumed that checking the "Delete unreachable documents" check box in the
job's "Hop Filters" tab would delete from my Solr instance every indexed
document that has since been removed or moved, because the job can no longer
reach it. However, we have verified that those documents are still in the
index, and I haven't seen any document deletions in the repository history
(where I can see the ingested documents).

Is this a bug in ManifoldCF? Or should I change some other setting so that
documents no longer reachable on the web site are removed from the Solr index?

Thanks
    Marisol
