manifoldcf-user mailing list archives

From Issei Nishigata <duo.2...@gmail.com>
Subject Specifications of HopFilters "Keep unreachable documents"
Date Thu, 07 Nov 2019 16:00:50 GMT
Hi All,


I am using MCF 2.12, and I am confused about the specification of the HopFilters "Keep unreachable documents" settings.

My understanding is that the HopFilters options "Keep unreachable documents, for now" and "Keep unreachable documents, forever" take effect when a HopCount is specified.

For example, suppose the first crawl indexes all data with the HopCount value left empty, and the second crawl sets HopCount to 0 with "Keep unreachable documents, for now". I would expect that only the first layer of the directory is crawled, and that the second and deeper layers, which are not crawled, are not deleted from the index.
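
To make the behaviour I expect concrete, here is a small toy model in Python. It is not ManifoldCF code and says nothing about how ManifoldCF is implemented; the function name expected_index, the sample paths, and the depth numbering (first layer = 0) are assumptions made only for illustration.

```python
from typing import Dict, Optional, Set


def expected_index(
    depths: Dict[str, int],
    index: Set[str],
    max_hops: Optional[int],
    keep_unreachable: bool,
) -> Set[str]:
    """Toy model: contents of the index after one crawl, as I expect them.

    depths           -- hypothetical map of document path -> hops from the seed
    index            -- documents already in the index before this crawl
    max_hops         -- HopCount limit; None models an empty HopCount value
    keep_unreachable -- models either "Keep unreachable documents" option
    """
    # Documents within the hop limit are crawled and (re)indexed.
    reachable = {
        doc for doc, depth in depths.items()
        if max_hops is None or depth <= max_hops
    }
    # Documents that still exist but lie beyond the hop limit are retained
    # only when a "Keep unreachable documents" option is selected.
    kept = (
        {doc for doc in index if doc in depths and doc not in reachable}
        if keep_unreachable
        else set()
    )
    return reachable | kept


depths = {"/a.txt": 0, "/sub/b.txt": 1}
first = expected_index(depths, set(), None, True)   # first crawl, empty HopCount
second = expected_index(depths, first, 0, True)     # second crawl, HopCount 0
```

In this model, second still contains "/sub/b.txt", which is the result I expected from the real job.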

However, when I actually run the job with the above settings, documents in the second layer are deleted from the index on the second and subsequent runs. The same happens with "Keep unreachable documents, forever".

Is there anything wrong with my understanding? Also, does anyone know the difference between these two settings, "Keep unreachable documents, for now" and "Keep unreachable documents, forever"?

If any of you know the specification of these settings, I would be grateful for your advice.
Any clue would be much appreciated.


Sincerely,
Issei Nishigata

