manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Zk ManifoldCF just questions
Date Tue, 05 Jun 2018 09:11:06 GMT
Hi Maxence,

I think this will answer your questions:

(1) Multiprocess MCF is stable, yes.  Zookeeper is the recommended
configuration; shared files are deprecated.  Zookeeper is used to
coordinate cluster processes and store global configuration.
(2) Multiprocess MCF is best viewed as a cluster.  The behavior does not
change when more cluster members are added.  Documents are still processed
in the same order, just more documents can be done at once.
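
To illustrate point (2), here is a minimal sketch in Python. It is not
ManifoldCF code: the function crawl, the names pending and claimed, and the
in-process lock are all hypothetical stand-ins, with the lock playing roughly
the coordination role that Zookeeper plays between real cluster processes.
It only shows that adding members does not change the order in which
documents are picked up, only how many can be in flight at once.

```python
import threading
from collections import deque


def crawl(documents, members):
    """Process `documents` with `members` workers sharing one queue.

    Returns the order in which documents were claimed and the set of
    documents that finished processing.
    """
    pending = deque(documents)   # shared work queue (hypothetical)
    lock = threading.Lock()      # stand-in for cluster-wide coordination
    claimed = []                 # order in which documents were picked up
    processed = set()

    def worker():
        while True:
            # Claiming a document is the only coordinated step.
            with lock:
                if not pending:
                    return
                doc = pending.popleft()
                claimed.append(doc)
            # Real fetch/index work would run here, outside the lock,
            # so several members can work on documents concurrently.
            with lock:
                processed.add(doc)

    threads = [threading.Thread(target=worker) for _ in range(members)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed, processed


if __name__ == "__main__":
    docs = [f"doc{i}" for i in range(20)]
    order_one, _ = crawl(docs, members=1)
    order_three, _ = crawl(docs, members=3)
    # Same claim order with one member or three.
    print(order_one == order_three)
```

The claim order is fixed by the shared queue, so going from one member to
three changes throughput but not the sequence.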

Karl

On Tue, Jun 5, 2018 at 4:54 AM msaunier <msaunier@citya.com> wrote:

> Hello Karl,
>
> I have a few questions.
>
> Today, I use single-process ManifoldCF. To crawl, I have 5 jobs per server,
> and a script that checks whether the previous job has finished and then
> starts the next one. I run only 1 job at a time per server.
>
> Would multiprocess be useful for a configuration like this?
>
> A second question:
>
> What is the purpose of Zookeeper in a multiprocess setup? Is it to
> distribute the configuration?
>
> And the last one:
>
> Is multiprocess ManifoldCF stable for a production environment? And with Zk?
>
> Thanks,
>
> Maxence
>
