nifi-users mailing list archives

From Pierre Villard <pierre.villard...@gmail.com>
Subject Re: DistributedMapCache w/ ListSFTP and FetchSFTP
Date Thu, 15 Dec 2016 21:21:07 GMT
Hi Nicholas,

You need to configure your ListSFTP processor to run only on the primary
node (set on the Scheduling tab of the processor configuration). Then send
its flow files to a Remote Process Group (RPG) that points to an input port
on the cluster itself, so that the flow files are distributed across the
cluster instead of staying on the primary node. The FetchSFTP processor on
each node then downloads the files it receives. ListSFTP, with its state
(DistributedCache), ensures that you don't download the same file twice,
and because each flow file lives on exactly one node, a given file won't
be downloaded by two nodes at the same time.
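
To illustrate the difference, here is a small Python sketch. It is a toy
model, not NiFi code: list_remote and distribute are hypothetical helpers
standing in for ListSFTP and the RPG hop, the file names are made up, and
the round-robin split is only an assumption (the real distribution over
the cluster need not be perfectly even).

```python
from itertools import cycle


def list_remote(files, seen):
    """Return files not listed before and record them (stand-in for ListSFTP state)."""
    new = [f for f in files if f not in seen]
    seen.update(new)
    return new


def distribute(flow_files, nodes):
    """Spread flow files round-robin over the nodes (stand-in for the RPG hop)."""
    assignment = {n: [] for n in nodes}
    for node, flow_file in zip(cycle(nodes), flow_files):
        assignment[node].append(flow_file)
    return assignment


nodes = ["node1", "node2", "node3"]
remote_files = [f"file_{i:03d}.dat" for i in range(100)]

# ListSFTP running on every node: each node lists all 100 files on its own.
listed_everywhere = [f for _ in nodes for f in list_remote(remote_files, set())]

# ListSFTP running on the primary node only, then spread over the cluster.
state = set()
listed_on_primary = list_remote(remote_files, state)
per_node = distribute(listed_on_primary, nodes)

print(len(listed_everywhere))                     # 300
print(len(listed_on_primary))                     # 100
print({n: len(v) for n, v in per_node.items()})   # 34 / 33 / 33
print(len(list_remote(remote_files, state)))      # 0, nothing new on the next run
```

In this model every file ends up on exactly one node, so each node
fetches only its own share.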

Hope this helps,
Pierre.

2016-12-15 22:13 GMT+01:00 Nicholas Hughes <nicholasmhughes.nifi@gmail.com>:

> I'm testing a simple List/Fetch setup on a 3 node cluster. I created a
> DistributedMapCacheServer controller service with the default settings (no
> SSL) and then created a DistributedMapCacheClientService that points at
> one of the cluster hostnames. The ListSFTP processor is set to use the
> Distributed Cache Service that I created.
>
> The ListSFTP processor lists the same 100 source files from the remote
> system on each node, and sends 300 Flow Files downstream to the FetchSFTP
> processor. I thought that the map cache allowed the cluster nodes to
> determine which files had already been listed by other cluster nodes...
> maybe I'm missing something.
>
> Any assistance is appreciated.
>
> NiFi version 1.0.0 in HDF 2.0.1
>
>
> -Nick
>
>
