nifi-users mailing list archives

From Bryan Bende <bbe...@gmail.com>
Subject Re: How to configure site-to-site communication between nodes in one cluster.
Date Wed, 01 Jun 2016 14:28:09 GMT
Hello,

This post [1] has a description of how to redistribute data within the
same cluster. You are correct that it involves an RPG pointing back to the
same cluster.

One thing to keep in mind is that typically we do this with a List + Fetch
pattern, where the List operation produces lightweight results, such as the
list of filenames to fetch. Those results are then redistributed, and the
fetching happens in parallel.
In your case, if I understand it correctly, you will have already fetched
the data on the first node, and then have to transfer the actual data to
the cluster nodes, which could have some overhead.

It might require a custom processor to do this, but you might want to
consider somehow determining what needs to be fetched after receiving the
HTTP request, and redistributing that so each node can then fetch from the
DB in parallel.
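
As a rough sketch of that idea, here is the split between "listing" and
"fetching" in plain Python. This is not NiFi API and not a processor
implementation. The function names, the (offset, limit) descriptor shape, and
the round-robin assignment are all assumptions made for illustration, and the
RPG's actual distribution across nodes may differ.

```python
from itertools import cycle


def plan_chunks(total_rows, chunk_size):
    """Return lightweight (offset, limit) descriptors covering total_rows.

    Hypothetical "List" step: it describes what to fetch without
    fetching any data itself.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, total_rows - offset))
        for offset in range(0, total_rows, chunk_size)
    ]


def assign_round_robin(descriptors, nodes):
    """Spread descriptors across node names in round-robin order.

    Stand-in for the redistribution step; each node would then run its
    own DB query for the ranges it received (the "Fetch" step).
    """
    assignment = {node: [] for node in nodes}
    for descriptor, node in zip(descriptors, cycle(nodes)):
        assignment[node].append(descriptor)
    return assignment


if __name__ == "__main__":
    chunks = plan_chunks(total_rows=10, chunk_size=4)
    print(assign_round_robin(chunks, ["node1", "node2"]))
```

The point is that only the small descriptors cross the network, and each node
pulls its own share of rows from the DB.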

Let me know if this doesn't make sense.

-Bryan

[1]
https://community.hortonworks.com/articles/16120/how-do-i-distribute-data-across-a-nifi-cluster.html


On Wed, Jun 1, 2016 at 6:06 AM, Yuri Nikonovich <utagai.by@gmail.com> wrote:

> Hi
> I have the following flow:
> Receive HTTP request -> Fetch data from db -> split it in chunks of fixed
> size -> process each chunk and save it to Cassandra.
>
> I've built a flow and it works perfectly on a non-clustered setup. But when
> I configured a clustered setup,
> I found out that all the heavy work is done on only one node. So if the flow
> has started on node1, it will run to the end on node1. What I want to
> achieve is to spread data chunks fetched from the DB across the cluster in
> order to process them in parallel, but it looks like NiFi doesn't send flow
> files between nodes in a cluster.
> As far as I understand, in order to make a node send data to another node I
> should create a remote process group and send data to this RPG. All
> examples I could find on the Internet describe RPGs as cluster-to-cluster
> communication or remote node-to-cluster communication. So for my case, I
> assume I have to create an RPG pointing to the same cluster. Could you please
> provide me with a guide on how to do this?
>
>
> --
> Regards,
> Nikanovich Yury
>
