manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Error handling configuration
Date Mon, 04 Jun 2018 03:53:44 GMT
Hi Yasufumi,

Connector writers are required to do the following when they write
connectors:

(1) List the kinds of errors the connector may encounter that might
potentially be resolved if a document is fetched another time;
(2) Come up with a way of detecting each such error;
(3) Decide on a reasonable retry strategy for each such error: how
frequently to retry, how long to retry (either as a count or as a total
elapsed time), and whether to abort the job if that retry strategy still
doesn't succeed.
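
To make item (3) concrete, here is a minimal sketch of how those three
decisions could be captured per error type.  Everything in it
(RetryPolicySketch, RetryPolicy, Outcome, decide) is a hypothetical name
made up for illustration; it does not use or mirror ManifoldCF's own
classes.

```java
import java.time.Duration;

// Hypothetical sketch: none of these names are ManifoldCF classes.
class RetryPolicySketch {

    /** What to do with a document after a failed fetch attempt. */
    enum Outcome { RETRY, SKIP_DOCUMENT, ABORT_JOB }

    /** Holds the three decisions from item (3) for one kind of error. */
    static final class RetryPolicy {
        final Duration interval;         // how frequently to retry
        final int maxAttempts;           // limit as a count; negative = no count limit
        final Duration maxElapsed;       // limit as elapsed time; null = no time limit
        final boolean abortJobOnFailure; // abort the job once a limit is reached?

        RetryPolicy(Duration interval, int maxAttempts,
                    Duration maxElapsed, boolean abortJobOnFailure) {
            this.interval = interval;
            this.maxAttempts = maxAttempts;
            this.maxElapsed = maxElapsed;
            this.abortJobOnFailure = abortJobOnFailure;
        }
    }

    /**
     * Decides the next step after a failure. The caller is expected to wait
     * policy.interval before the next attempt when the result is RETRY.
     */
    static Outcome decide(RetryPolicy policy, int failedAttempts, Duration elapsed) {
        boolean countExhausted =
            policy.maxAttempts >= 0 && failedAttempts >= policy.maxAttempts;
        boolean timeExhausted =
            policy.maxElapsed != null && elapsed.compareTo(policy.maxElapsed) >= 0;
        if (!countExhausted && !timeExhausted) {
            return Outcome.RETRY;
        }
        return policy.abortJobOnFailure ? Outcome.ABORT_JOB : Outcome.SKIP_DOCUMENT;
    }

    public static void main(String[] args) {
        // Example policy: retry every 5 minutes, at most 3 times or 1 hour, then skip.
        RetryPolicy policy =
            new RetryPolicy(Duration.ofMinutes(5), 3, Duration.ofHours(1), false);
        System.out.println(decide(policy, 1, Duration.ofMinutes(5)));
        System.out.println(decide(policy, 3, Duration.ofMinutes(15)));
    }
}
```

The point of the sketch is only that each detectable error gets its own
policy; an error that can never be resolved would simply have no retry
policy at all.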

It is *not* recommended to retry *all* possible errors, since some errors
(e.g. errors that occur because of bad configuration) can never be resolved.

The File Connector uses the standard java.io package to crawl locally
mounted disk drives.  Disk errors therefore generally surface as
java.io.IOException derivatives, none of which, as far as I know, has any
chance of succeeding if the operation is retried.  I will therefore need
much more detail about the kinds of errors you expect a retry to resolve,
and what the resolution strategy should be for each one.  Or,
since you hint at file permissions being the situation you want to address,
maybe you don't want to retry at all when a file permission is the cause,
but just skip the file?
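
For illustration only, "skip rather than retry" on a permission problem
could look roughly like the sketch below.  It is not the File Connector's
actual code; PermissionSkipSketch, Result and processIfReadable are
hypothetical names, and the only library calls used are standard java.io
ones.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: this is not the File Connector's actual code.
class PermissionSkipSketch {

    enum Result { PROCESSED, SKIPPED_UNREADABLE }

    /**
     * Reads the file if the current user can read it; otherwise skips it
     * without retrying. Note that File.canRead() is also false when the
     * file does not exist, so missing files are skipped the same way.
     */
    static Result processIfReadable(File file) throws IOException {
        if (!file.canRead()) {
            return Result.SKIPPED_UNREADABLE; // a retry would fail the same way
        }
        byte[] buffer = new byte[8192];
        try (InputStream in = new FileInputStream(file)) {
            while (in.read(buffer) != -1) {
                // A real connector would hand the bytes to the indexing pipeline here.
            }
        }
        return Result.PROCESSED;
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            System.out.println(path + ": " + processIfReadable(new File(path)));
        }
    }
}
```

Any other IOException still propagates to the caller, which is where a
per-error decision about retrying, skipping, or aborting would be made.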

Karl



On Sun, Jun 3, 2018 at 10:41 PM Yasufumi Mizoguchi <yasufumi0410@gmail.com>
wrote:

> Hi Karl,
>
> Thank you for your reply.
>
> I am
> using "org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector"
> for indexing my local filesystem before indexing my file servers.
> And I want ManifoldCF to retry at least once when facing any errors.
> Now, I am trying to generate errors by file permissions, but it seemed
> that MCF skipped the error. And I could not find any settings for error
> handling...
>
> Thanks,
> Yasufumi.
>
> On Fri, Jun 1, 2018 at 16:20, Karl Wright <daddywri@gmail.com> wrote:
>
>> Hi Yasufumi,
>>
>> Individual connectors determine what happens on specific kinds of errors
>> that they receive.  The connector can determine the pattern of behavior
>> based on what kind of ServiceInterruption exception it throws when the
>> error occurs.  So this is not "configurable"; the logic for decision-making
>> is part of connector code.
>>
>> What specific connectors are you using, and what errors are problematic
>> for you?
>>
>> Karl
>>
>>
>> On Thu, May 31, 2018 at 11:15 PM Yasufumi Mizoguchi <
>> yasufumi0410@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am now testing ManifoldCF v.2.10 for indexing my file server.
>>> I tried to find an error handling configuration (e.g. skip/retry/abort),
>>> but there is no description of one in the manuals.
>>> As far as I could confirm, ManifoldCF seems to skip processing on error.
>>>
>>> So, my questions are as follows.
>>>
>>> 1. Can I configure error handling in ManifoldCF?
>>> 2. If so, how do I do it?
>>>
>>> Thanks,
>>> Yasufumi
>>>
>>>
>>>
